US Application No. 10/681,878 
Clean Copy of Substitute Specification

PRODUCTION OF SYRINGYL LIGNIN IN GYMNOSPERMS 
Field of the Invention 

[0001] This application claims the benefit of U.S. Provisional Application No. 60/033,381, 
filed Dec. 16, 1996. The invention relates to the molecular modification of gymnosperms in 
order to cause the production of syringyl units during lignin biosynthesis and to production 
and propagation of gymnosperms containing syringyl lignin. 
Background of the Invention 

[0002] Lignin is a major part of the supportive structure of most woody plants including 
angiosperm and gymnosperm trees which in turn are the principal sources of fiber for making 
paper and cellulosic products. In order to liberate fibers from wood structure in a manner 
suitable for making many grades of paper, it is necessary to remove much of the lignin from 
the fiber/lignin network. Lignin is removed from wood chips by treatment of the chips in an 
alkaline solution at elevated temperatures and pressure in an initial step of papermaking 
processes. The rate of removal of lignin from wood of different tree species varies depending 
upon lignin structure. Three different lignin structures have been identified in trees:
p-hydroxyphenyl, guaiacyl and syringyl, which are illustrated in FIG. 1.
[0003] Angiosperm species, such as Liquidambar styraciflua L. [sweetgum], have lignin 
composed of a mixture of guaiacyl and syringyl monomer units. In contrast, gymnosperm 
species such as Pinus taeda L. [loblolly pine] have lignin which is devoid of syringyl 
monomer units. Generally speaking, the rate of delignification in a pulping process is directly 
proportional to the amount of syringyl lignin present in the wood. The higher delignification 
rates associated with species having a greater proportion of syringyl lignin result in more 
efficient pulp mill operations since the mills make better use of energy and capital investment 
and the environmental impact is lessened due to a decrease in chemicals used for 
delignification. 
[0004] It is therefore an object of the invention to provide gymnosperm species which are
easier to delignify in pulping processes. 

[0005] Another object of the invention is to provide gymnosperm species such as loblolly 
pine which contain syringyl lignin. 

[0006] An additional object of the invention is to provide a method for modifying genes 
involved in lignin biosynthesis in gymnosperm species so that production of syringyl lignin is 
increased while production of guaiacyl lignin is suppressed. 

[0007] Still another object of the invention is to produce whole gymnosperm plants 
containing genes which increase production of syringyl lignin and repress production of 
guaiacyl lignin. 

[0008] Yet another object of the invention is to identify, isolate and/or clone those genes in 
angiosperms responsible for production of syringyl lignin. 

[0009] A further object of the invention is to provide, in gymnosperms, genes which 
produce syringyl lignin. 

[0010] Another object of the invention is to provide a method for making an expression 
cassette insertable into a gymnosperm cell for the purpose of inducing formation of syringyl 
lignin in a gymnosperm plant derived from the cell. 
Definitions 

[0011] The term "promoter" refers to a DNA sequence in the 5' flanking region of a given 
gene which is involved in recognition and binding of RNA polymerase and other 
transcriptional proteins and is required to initiate DNA transcription in cells. 
[0012] The term "constitutive promoter" refers to a promoter which activates transcription 
of a desired gene, and is commonly used in creation of an expression cassette designed for 
preliminary experiments relative to testing of gene function. An example of a constitutive 
promoter is 35S CaMV, available from Clontech.
[0013] The term "expression cassette" refers to a double stranded DNA sequence which
contains both promoters and genes such that expression of a given gene is achieved upon
insertion of the expression cassette into a plant cell.

[0014] The term "plant" includes whole plants and portions of plants, including plant 
organs (e.g. roots, stems, leaves, etc.).

[0015] The term "angiosperm" refers to plants which produce seeds encased in an ovary. A 
specific example of an angiosperm is Liquidambar styraciflua (L.)[sweetgum]. The 
angiosperm sweetgum produces syringyl lignin. 

[0016] The term "gymnosperm" refers to plants which produce naked seeds, that is, seeds 
which are not encased in an ovary. A specific example of a gymnosperm is Pinus taeda 
(L.)[loblolly pine]. The gymnosperm loblolly pine does not produce syringyl lignin. 
Summary of the Invention 

[0017] With regard to the above and other objects, the invention provides a method for 
inducing production of syringyl lignin in gymnosperms, and gymnosperms which contain
syringyl lignin for improved delignification in the production of pulp for papermaking and 
other applications. In accordance with one of its aspects, the invention involves cloning an 
angiosperm DNA sequence which codes for enzymes involved in production of syringyl 
lignin monomer units, fusing the angiosperm DNA sequence to a lignin promoter region to 
form an expression cassette, and inserting the expression cassette into a gymnosperm 
genome. 

[0018] Enzymes required for production of syringyl lignin in an angiosperm are obtained 
by deducing an amino acid sequence of the enzyme, extrapolating an mRNA sequence from 
the amino acid sequence, constructing a probe for the corresponding DNA sequence and 
cloning the DNA sequence which codes for the desired enzyme. A promoter region specific 
to a gymnosperm lignin biosynthesis gene is identified by constructing a probe for a 
gymnosperm lignin biosynthesis gene, sequencing the 5' flanking region of the DNA which
encodes the gymnosperm lignin biosynthesis gene to locate a promoter sequence, and then 
cloning that sequence. 

[0019] An expression cassette is constructed by fusing the angiosperm syringyl lignin DNA 
sequence to the gymnosperm promoter DNA sequence. Alternatively, the angiosperm 
syringyl lignin DNA is fused to a constitutive promoter to form an expression cassette. The 
expression cassette is inserted into the gymnosperm genome to transform the gymnosperm 
genome. Cells containing the transformed genome are selected and used to produce a 
transformed gymnosperm plant containing syringyl lignin. 

[0020] In accordance with the invention, the angiosperm gene sequences bi-OMT, 4CL, 
P450-1 and P450-2 have been determined and isolated as associated with production of
syringyl lignin in sweetgum and lignin promoter regions for the gymnosperm loblolly pine 
have been determined to be the 5' flanking regions for the 4CL1B, 4CL3B and PAL 
gymnosperm lignin genes. Expression cassettes containing sequences of selected genes from 
sweetgum have been inserted into loblolly pine embryogenic cells and presence of sweetgum 
genes associated with production of syringyl lignin has been confirmed in daughter cells of 
the resulting loblolly pine embryogenic cells. 

[0021] The invention therefore enables production of gymnosperms such as loblolly pine 
containing genes which code for production of syringyl lignin, to thereby produce in such 
species syringyl lignin in the wood structure for enhanced pulpability. 
Brief Description of the Drawings 

[0022] The above and other aspects of the invention will now be further described in the 
following detailed specification considered in conjunction with the following drawings in 
which: 

[0023] FIG. 1 illustrates a generalized pathway for lignin synthesis;

[0024] FIGS. 2A-2E illustrate a bifunctional-O-methyl transferase (bi-OMT) gene sequence
involved in the production of syringyl lignin in an angiosperm (SEQ ID 5 coding SEQ ID 6);

[0025] FIGS. 3A-3G illustrate a 4-coumarate CoA ligase (4CL) gene sequence involved in
the production of syringyl lignin in an angiosperm (SEQ ID 7 coding SEQ ID 8);

[0026] FIG. 4 illustrates a ferulic acid-5-hydroxylase (P450-1) gene sequence involved in
the production of syringyl lignin in an angiosperm (SEQ ID 1 coding SEQ ID 2);

[0027] FIG. 5 illustrates a ferulic acid-5-hydroxylase (P450-2) gene sequence involved in
the production of syringyl lignin in an angiosperm (SEQ ID 3 coding SEQ ID 4);

[0028] FIG. 6 illustrates nucleotide sequences of the 5' flanking region of the loblolly pine
4CL1B gene showing the location of regulatory elements for lignin biosynthesis (SEQ ID
10);

[0029] FIGS. 7A-7B illustrate nucleotide sequences of the 5' flanking region of the loblolly
pine 4CL3B gene showing the location of regulatory elements for lignin biosynthesis (SEQ
ID 11);

[0030] FIGS. 8A-8B illustrate nucleotide sequences of the 5' flanking region of loblolly
pine PAL gene showing the location of regulatory elements for lignin biosynthesis (SEQ ID
9); and

[0031] FIG. 9 illustrates a PCR confirmation of the sweetgum P450-1 gene sequence in
transgenic loblolly pine cells.
Detailed Description of the Invention

[0032] In accordance with the invention, a method is provided for modifying a 
gymnosperm genome, such as the genome of a loblolly pine, so that syringyl lignin will be 
produced in the resulting plant, thereby enabling cellulosic fibers of the same to be more 
easily separated from lignin in a pulping process. In general, this is accomplished by fusing 
one or more angiosperm DNA sequences (referred to at times herein as the "ASL DNA
sequences") which are involved in production of syringyl lignin to a gymnosperm lignin
promoter region (referred to at times herein as the "GL promoter region") specific to genes 
involved in gymnosperm lignin biosynthesis to form a gymnosperm syringyl lignin 
expression cassette (referred to at times herein as the "GSL expression cassette"). 
Alternatively, the one or more ASL DNA sequences are fused to one or more constitutive 
promoters to form a GSL expression cassette. 

[0033] The GSL expression cassette preferably also includes selectable marker genes which 
enable transformed cells to be differentiated from untransformed cells. The GSL expression 
cassette containing selectable marker genes is inserted into the gymnosperm genome and 
transformed cells are identified and selected, from which whole gymnosperm plants may be 
produced which exhibit production of syringyl lignin. 

[0034] To suppress production of less preferred forms of lignin in gymnosperms, such as 
guaiacyl lignin, genes from the gymnosperm associated with production of these less 
preferred forms of lignin are identified, isolated and the DNA sequence coding for anti-sense 
mRNA (referred to at times herein as the "GL anti-sense sequence") for these genes is 
produced. The DNA sequence coding for anti-sense mRNA is then incorporated into the
gymnosperm genome; when expressed, the anti-sense mRNA binds to the less preferred
guaiacyl gymnosperm lignin mRNA, inactivating it.

[0035] Further features of these and various other steps and procedures associated with 
practice of the invention will now be described in more detail beginning with identification 
and isolation of ASL DNA sequences of interest for use in inducing production of syringyl 
lignin in a gymnosperm. 

I. Determination Of DNA Sequence For Genes Associated With Production Of Syringyl 
Lignin 

[0036] The general biosynthetic pathway for production of lignin has been postulated as 
shown in FIG. 1. From FIG. 1, it can be seen that the genes CCL, OMT and F5H (which is
from the class of P450 genes) may play key roles in production of syringyl lignin in some
plant species, but their specific contributions and mechanisms remain to be positively 
established. It is suspected that the CCL, OMT and F5H genes may have specific equivalents 
in a specific angiosperm, such as sweetgum. Accordingly, one aim of the present invention is 
to identify, sequence and clone specific genes of interest from an angiosperm such as 
sweetgum which are involved in production of syringyl lignin and to then introduce those 
genes into the genome of a gymnosperm, such as loblolly pine, to induce production of 
syringyl lignin. 

[0037] Genes of interest may be identified in various ways, depending on how much 
information about the gene is already known. Genes believed to be associated with 
production of syringyl lignin have already been sequenced from a few angiosperm species, 
viz., CCL and OMT.

[0038] DNA sequences of the various CCL and OMT genes are compared to each other to 
determine if there are conserved regions. Once the conserved regions of the DNA sequences 
are identified, oligo-dT primers homologous to the conserved sequences are synthesized. 
Reverse transcription of the DNA-free total RNA which was purified from sweetgum xylem 
tissue, followed by double PCR using gene-specific primers, enables production of probes for 
the CCL and OMT genes. 
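
The comparison step described above can be illustrated with a short sketch. It assumes the sequences have already been aligned to equal length (with "-" marking gaps). The function name conserved_runs, the min_len threshold and the toy sequences are hypothetical and are not taken from the specification or from any real CCL or OMT gene.

```python
def conserved_runs(aligned, min_len=4):
    """Return half-open (start, end) column ranges in which every
    pre-aligned sequence carries the same non-gap base."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    runs = []
    start = None
    # Iterate one position past the end so an open run is always closed.
    for i in range(length + 1):
        same = (
            i < length
            and aligned[0][i] != "-"
            and all(seq[i] == aligned[0][i] for seq in aligned)
        )
        if same and start is None:
            start = i
        elif not same and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical toy alignment, for illustration only.
toy = [
    "ATGGCTACCGGTACCA",
    "ATGGCAACCGGTACGA",
    "ATGGCGACCGGTACTA",
]

for start, end in conserved_runs(toy):
    print(start, end, toy[0][start:end])
```

Runs found this way would only be candidates; primer design would still need to weigh length, degeneracy and melting temperature, none of which this sketch models.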

[0039] A sweetgum cDNA library is constructed in a host, such as lambda ZAPII, available 
from Stratagene, of La Jolla, Calif., using poly(A)+RNA isolated from sweetgum xylem,
according to the methods described by Bugos et al. (1995 Biotechniques 19:734-737). The 
above mentioned probes are used to assay the sweetgum cDNA library to locate cDNA which 
codes for enzymes involved in production of syringyl lignin. Once a syringyl lignin sequence 
is located, it is then cloned and sequenced according to known methods which are familiar to 
those of ordinary skill. 
[0040] In accordance with the invention, two sweetgum syringyl lignin genes have been
determined using the above-described technique. These genes have been designated 4CL and 
bi-OMT. The sequence obtained for the sweetgum syringyl lignin gene, designated bi-OMT, 
is illustrated in FIG. 2 (SEQ ID 5 and 6). The sequence obtained for the sweetgum syringyl 
lignin gene, designated 4CL, is illustrated in FIG. 3 (SEQ ID 7 and 8). 
[0041] An alternative procedure was employed to identify the F5H equivalent genes in 
sweetgum. Because the DNA sequences for similar P450 genes from other plant species were 
known, probes for the P450 genes were designed based on the conserved regions found by 
comparing the known sequences for similar P450 genes. The known P450 sequences used for 
comparison include all plant P450 genes in the GenBank database. Primers were designed 
based on two highly conserved regions which are common to all known plant P450 genes. 
The primers were then used in a PCR reaction with the sweetgum cDNA library as a 
template. Once P450-like fragments were located, they were amplified using standard PCR 
techniques, cloned into a pBluescript vector available from Clontech of Palo Alto, Calif., and
transformed into a DH5α E. coli strain available from Gibco BRL of Gaithersburg, Md.
[0042] After E. coli colonies were tested in order to determine that they contained the 
P450-like DNA fragments, the fragments were sequenced. Several P450-like sequences were 
located in sweetgum using the above described technique. One P450-like sequence was 
sufficiently different from other known P450 sequences to indicate that it represented a new 
P450 gene family. This potentially new P450 cDNA fragment was used as a probe to screen
the sweetgum xylem library for full-length clones. These putative hydroxylase P450 clones
were designated P450-1 and P450-2. The sequences obtained for P450-1 and P450-2 are
illustrated in FIG. 4 (SEQ ID 1 and 2) and FIG. 5 (SEQ ID 3 and 4), respectively.
II. Identification Of GL Gene Promoter Regions 
[0043] In order to locate gymnosperm lignin promoter regions, probes are developed to
locate lignin genes. After the gymnosperm lignin gene is located, the portion of DNA
upstream from the gene is sequenced, preferably using the Genome Walker Kit, available 
from Clontech. The portion of DNA upstream from the lignin gene will generally contain
the gymnosperm lignin promoter region. 

[0044] Gymnosperm genes of interest include CCL-like genes and PAL-like genes, which 
are believed to be involved in the production of lignin in gymnosperms. Preferred probe
sequences are developed based on previously sequenced genes, which are available from the 
gene bank. The preferred gene bank accession numbers for the CCL-like genes include 
U39404 and U39405. A preferred gene bank accession number for a PAL-like gene is 
U39792. Probes for such genes are constructed according to methods familiar to those of 
ordinary skill in the art. A genomic DNA library is constructed and DNA fragments which 
code for gymnosperm lignin genes are then identified using the above mentioned probes. A 
preferred DNA library is obtained from the gymnosperm, Pinus taeda (L.)[loblolly pine],
and a preferred host of the genomic library is Lambda DASH II, available from Stratagene of
La Jolla, Calif.

[0045] Once the DNA fragments which code for the gymnosperm lignin genes are located, 
the genomic region upstream from the gymnosperm lignin gene (the 5' flanking region) is
identified. This region contains the GL promoter. Three promoter regions were located from 
gymnosperm lignin biosynthesis genes. The first is the 5' flanking region of the loblolly pine 
4CL1B gene, shown in FIG. 6 (SEQ ID 10). The second is the 5' flanking region of the 
loblolly pine gene 4CL3B, shown in FIG. 7 (SEQ ID 11). The third is the 5' flanking region
of the loblolly pine gene PAL, shown in FIG. 8 (SEQ ID 9). 
III. Fusing The GL Promoter Region To The ASL DNA Sequence 
[0046] The next step of the process is to fuse the GL promoter region to the ASL DNA
sequence to make a GSL expression cassette for insertion into the genome of a gymnosperm. 
This may be accomplished by standard techniques. In a preferred method, the GL promoter 
region is first cloned into a suitable vector. Preferred vectors are pGEM7Z, available from 
Promega, Madison, Wis. and SK available from Stratagene, of La Jolla, Calif. After the
promoter sequence is cloned into the vector, it is then released with suitable restriction 
enzymes. The ASL DNA sequence is released with the same restriction enzyme(s) and 
purified. 

[0047] The GL promoter region sequence and the ASL DNA sequence are then ligated such 
as with T4 DNA ligase, available from Promega, to form the GSL expression cassette. Fusion 
of the GL and ASL DNA sequence is confirmed by restriction enzyme digestion and DNA 
sequencing. After confirmation of GL promoter-ASL DNA fusion, the GSL expression 
cassette is released from the original vector with suitable restriction enzymes and used in 
construction of vectors for plant transformation. 

IV. Fusing The ASL DNA Sequence to a Constitutive Promoter Region 
[0048] In an alternative embodiment, a standard constitutive promoter may be fused with 
the ASL DNA sequence to make a GSL expression cassette. For example, a standard 
constitutive promoter may be fused with P450-1 to form an expression cassette for insertion 
of P450-1 sequences into a gymnosperm genome. In addition, a standard constitutive 
promoter may be fused with P450-2 to form an expression cassette for insertion of P450-2 
into a gymnosperm genome. A constitutive promoter for use in the invention is the double 
35S promoter, available from Clontech.

[0049] In the preferred practice of the invention using constitutive promoters, a suitable 
vector such as pBI221 is digested with XbaI and HindIII to release the 35S promoter. At the same
time the vector pHygro, available from International Paper, was digested by XbaI and
HindIII to release the double 35S promoter. The double 35S promoter was ligated to the
previously digested pBI221 vector to produce a new pBI221 with the double 35S promoter.
This new pBI221 was digested with SacI and SmaI to release the GUS fragment. The vector
is next treated with T4 DNA polymerase to produce blunt ends and the vector is self-ligated.
This vector is then further digested with BamHI and XbaI, available from Promega. After the
pBI221 vector containing the constitutive promoter region has been prepared, lignin gene 
sequences are prepared for insertion into the pBI221 vector. 

[0050] The coding regions of sweetgum P450-1 or P450-2 are amplified by PCR using 
primers with restriction sites incorporated in the 5' and 3' ends. In one example, an XbaI site
was incorporated at the 5' end and a BamHI site was incorporated at the 3' end of the
sweetgum P450-1 or P450-2 genes. After PCR, the P450-1 and P450-2 genes were separately
cloned into a TA vector available from Invitrogen. The TA vectors containing the P450-1 and
P450-2 genes, respectively, were digested by XbaI and BamHI to release the P450-1 or
P450-2 sequences.
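
As a rough illustration of how restriction sites can be built into PCR primers, the sketch below prepends the XbaI recognition sequence (TCTAGA) to a forward primer and the BamHI recognition sequence (GGATCC) to a reverse primer. The function names, the annealing length and the toy coding sequence are hypothetical; the actual primers used for P450-1 and P450-2 are not given in this section.

```python
XBAI_SITE = "TCTAGA"   # XbaI recognition sequence
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def tailed_primers(coding, anneal_len=18):
    """Build a forward primer carrying an XbaI site and a reverse primer
    carrying a BamHI site. Both are written 5' to 3'."""
    # An internal site would be cut during the later digest, so refuse it.
    for site in (XBAI_SITE, BAMHI_SITE):
        if site in coding:
            raise ValueError(f"coding region already contains {site}")
    forward = XBAI_SITE + coding[:anneal_len]
    reverse = BAMHI_SITE + reverse_complement(coding[-anneal_len:])
    return forward, reverse


# Hypothetical toy coding region, not a sweetgum sequence.
toy_coding = "ATGGCTTCACTGTTAGGCAAGCTGACCTGA"
forward, reverse = tailed_primers(toy_coding, anneal_len=9)
print(forward)
print(reverse)
```

In this arrangement the amplified product carries the XbaI site upstream of the start of the coding region and the BamHI site downstream of its end, which is what allows directional release with the two enzymes.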

[0051] The p35SS vector, described above, and the isolated sweetgum P450-1 or P450-2 
fragments were then ligated to make GSL expression cassettes containing the constitutive
promoter. 

V. Inserting the Expression Cassette into the Gymnosperm Genome
[0052] There are a number of methods by which the GSL expression cassette may be 
inserted into a target gymnosperm cell. One method of inserting the expression cassette into 
the gymnosperm is by micro-projectile bombardment of gymnosperm cells. For example, 
embryogenic tissue cultures of loblolly pine may be initiated from immature zygotic 
embryos. Tissue is maintained in an undifferentiated state on semi-solid proliferation 
medium. For transformation, embryogenic tissue is suspended in liquid proliferation
medium. Cells are then sieved through a screen, preferably 40 mesh, to separate small,
densely cytoplasmic cells from large vacuolar cells. 

[0053] After separation, a portion of the liquid cell suspension fraction is vacuum deposited 
onto filter paper and placed on semi-solid proliferation medium. The prepared gymnosperm 
target cells are then grown for several days on filter paper discs in a petri dish. 
[0054] A 1:1 mixture of plasmid DNA containing the selectable marker expression cassette
and plasmid DNA containing the P450-1 expression cassette may be precipitated with gold to 
form microprojectiles. The microprojectiles are rinsed in absolute ethanol and aliquots are
dried onto a suitable macrocarrier such as the macrocarrier available from BioRad in 
Hercules, Calif. 

[0055] Prior to bombardment, embryogenic tissue is preferably desiccated under a sterile 
laminar-flow hood. The desiccated tissue is transferred to semi-solid proliferation medium. 
The prepared microprojectiles are accelerated from the macrocarrier into the desiccated target 
cells using a suitable apparatus such as a BioRad PDS-1000/HE particle gun. In a preferred 
method, each plate is bombarded once, rotated 180 degrees, and bombarded a second time.
Preferred bombardment parameters are 1350 psi rupture disc pressure, 6 mm distance from 
the rupture disc to macrocarrier (gap distance), 1 cm macrocarrier travel distance, and 10 cm 
distance from macrocarrier stopping screen to culture plate (microcarrier travel distance). 
Tissue is then transferred to semi-solid proliferation medium containing a selection agent, 
such as hygromycin B, for two days after bombardment. 

[0056] Other methods of inserting the GSL expression cassette include use of silicon 
carbide whiskers, transformed protoplasts, Agrobacterium vectors and electroporation. 
VI. Identifying Transformed Cells 

[0057] In general, insertion of the GSL expression cassette will typically be carried out in a 
mass of cells and it will be necessary to determine which cells harbor the recombinant DNA 
molecule containing the GSL expression cassette. Transformed cells are first identified by
their ability to grow vigorously on a medium containing an antibiotic which is toxic to
non-transformed cells. Preferred antibiotics are kanamycin and hygromycin B. Cells which grow
vigorously on antibiotic-containing medium are further tested for presence of portions
of the plasmid vector, of the syringyl lignin genes in the GSL expression cassette (e.g. the
angiosperm bi-OMT, 4CL, P450-1 or P450-2 gene), or of other
fragments in the GSL expression cassette. Specific methods which can be used to test for
presence of portions of the GSL expression cassette include Southern blotting with a labeled 
complementary probe or PCR amplification with specific complementary primers. In yet 
another approach, an expressed syringyl lignin enzyme can be detected by Western blotting 
with a specific antibody, or by assaying for a functional property such as the appearance of 
functional enzymatic activity. 

VII. Production of a Gymnosperm Plant from the Transformed Gymnosperm Cell
[0058] Once transformed embryogenic cells of the gymnosperm have been identified, 
isolated and multiplied, they may be grown into plants. It is expected that all plants resulting 
from transformed cells will contain the GSL expression cassette in all their cells, and that 
wood in the secondary growth stage of the mature plant will be characterized by the presence 
of syringyl lignin. 

[0059] Transgenic embryogenic cells are allowed to replicate and develop into somatic
embryos, which are then converted into somatic seedlings.

VIII. Identification, Production and Insertion of a GL mRNA Anti-Sense Sequence 
[0060] In addition to adding ASL DNA sequences, anti-sense sequences may be 
incorporated into a gymnosperm genome, via GSL expression cassettes, in order to suppress 
formation of the less preferred native gymnosperm lignin. To this end, the gymnosperm 
lignin gene is first located and sequenced in order to determine its nucleotide sequence. 
Methods for locating and sequencing amino acids which have been previously discussed may
be employed. For example, if the gymnosperm lignin gene has already been purified, 
standard sequencing methods may be employed to determine the DNA nucleic acid sequence. 
[0061] If the gymnosperm lignin gene has not been purified and functionally similar DNA 
or mRNA sequences from similar species are known, those sequences may be compared to 
identify highly conserved regions and this information used as a basis for the construction of 
a probe. A gymnosperm cDNA or genomic library can be probed with the above mentioned 
sequences to locate the gymnosperm lignin cDNA or genomic DNA. Once the gymnosperm 
lignin DNA is located, it may be sequenced using standard sequencing methods. 
[0062] After the DNA sequence has been obtained for a gymnosperm lignin sequence, the 
complementary anti-sense strand is constructed and incorporated into an expression cassette. 
For example, the GL mRNA anti-sense sequence may be fused to a promoter region to form 
an expression cassette as described above. In a preferred method, the GL mRNA anti-sense 
sequence is incorporated into the previously discussed GSL expression cassette which is 
inserted into the gymnosperm genome as described above. 
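
The relationship between a target mRNA and its anti-sense strand can be sketched in a few lines. This is only an illustration of base-pairing logic under the usual Watson-Crick rules; the function names and the toy mRNA fragment are hypothetical and do not correspond to any gymnosperm lignin gene.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def antisense_rna(mrna):
    """Return the anti-sense RNA (written 5' to 3') that is fully
    complementary to the given mRNA (also written 5' to 3')."""
    return "".join(RNA_PAIR[base] for base in reversed(mrna))


def is_fully_paired(mrna, anti):
    """Check that every base of the mRNA pairs with the anti-sense strand
    when the two strands are aligned antiparallel."""
    return len(mrna) == len(anti) and all(
        RNA_PAIR[m] == a for m, a in zip(mrna, reversed(anti))
    )


# Hypothetical toy mRNA fragment, for illustration only.
toy_mrna = "AUGGCUUCACUG"
toy_anti = antisense_rna(toy_mrna)
print(toy_anti)
print(is_fully_paired(toy_mrna, toy_anti))
```

The sketch covers sequence complementarity only; whether an expressed anti-sense transcript actually suppresses the target in planta depends on factors it does not model.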

IX. Inclusion of Cytochrome P450 Reductase (CPR) to Enhance Biosynthesis Of Syringyl
Lignin in Gymnosperms

[0063] In the absence of external cofactors such as NADPH (an electron donor in reductive 
biosyntheses), certain angiosperm lignin genes such as the P450 genes may remain inactive 
or not achieve full or desired activity after insertion into the genome of a gymnosperm.
Inactivity or insufficient activity can be determined by testing the resulting plant which 
contains the P450 genes for the presence of syringyl lignin in secondary growth. It is known 
that cytochrome P450 reductase (CPR) may be involved in promoting certain reductive 
biochemical reactions, and may activate the desired expression of genes in many plants. 
Accordingly, if it is desired to enhance the expression of the angiosperm syringyl lignin 
genes in the gymnosperm, CPR may be inserted in the gymnosperm genome. In order to 
express CPR, the DNA sequence of the enzyme is ligated to a constitutive promoter or, for a
specific species such as loblolly pine, xylem-specific lignin promoters such as PAL, 4CL1B 
or 4CL3B to form an expression cassette. The expression cassette may then be inserted into 
the gymnosperm genome by various methods as described above. 
X. Examples 

[0064] The following non-limiting examples illustrate further aspects of the invention. In 
these examples, the angiosperm is Liquidambar styraciflua (L.)[sweetgum] and the 
gymnosperm is Pinus taeda (L.)[loblolly pine]. The nomenclature for the genes referred to in 
the examples is as follows: 



Genes                         Biochemical Name

4CL (angiosperm)              4-coumarate CoA ligase
bi-OMT (angiosperm)           bifunctional-O-methyl transferase
FA5H P450-1 (angiosperm)      Cytochrome P450
P450-2 (angiosperm)           Cytochrome P450
PAL (gymnosperm)              phenylalanine ammonia-lyase
4CL1B (gymnosperm)            4-coumarate CoA ligase
4CL3B (gymnosperm)            4-coumarate CoA ligase

Example 1 - Isolating and Sequencing bi-OMT and 4CL Genes from an Angiosperm 



[0065] A cDNA library for sweetgum was constructed in Lambda ZAPII, available from
Stratagene, of La Jolla, Calif., using poly(A)+RNA isolated from sweetgum xylem tissue.
Probes for bi-OMT and 4CL were obtained through reverse transcription of their mRNAs
followed by double PCR using gene-specific primers which were designed based on the
OMT and CCL cDNA sequences obtained from similar genes cloned from other species. 
[0066] Three primers were used for amplifying OMT fragments. One was an oligo-dT
primer (which was used to clone bi-OMT gene fragments through a modified
differential display technique, as described below in Example 2) and the other two were
degenerate primers, which were based on the conserved sequences of all known OMTs. The
two degenerate primers were derived based on the following amino acid sequences:

5'-Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr Ala Ala Cys Ala Ala 
Gly Gly Cys-3' (primer #22) (SEQ ID 12) and 

3'-Ala Ala Ala Gly Ala Gly Ala Gly Asn Ala Cys Asn Asn Ala Asn Asn Ala
Asn Gly Ala-5' (primer #23) (SEQ ID 13). 

[0067] A 900 bp PCR product was produced when oligo-dT primer and primer #22 were 
used, and a 550 bp fragment was produced when primer numbers 22 and 23 were used. 
[0068] Three primers were used for amplifying CCL fragments. They were derived from 
the following amino acid sequences: 

5'-Thr Thr Gly Gly Ala Thr Cys Cys Gly Gly Ile Ala Cys Ile Ala Cys Ile Gly
Gly Ile Tyr Thr Ile Cys Cys Ile Ala Ala Arg Gly Gly-3' (primer R1S) (SEQ ID 14)

5'-Thr Thr Gly Gly Ala Thr Cys Cys Gly Thr Ile Gly Thr Ile Gly Cys Ile Cys
Ala Arg Cys Ala Arg Gly Thr Ile Gly Ala Tyr Gly Gly-3' (primer H1S) (SEQ ID 15) and

3'-Cys Cys Ile Cys Thr Tyr Thr Ala Asp Ala Cys Arg Thr Ala Asp Gly Cys Ile
Cys Cys Ala Gly Cys Thr Gly Thr Ala-5' (primer R2A) (SEQ ID 16)

[0069] R1S and H1S were both sense primers. Primer R2A was an anti-sense primer. A 650
bp fragment was produced if R1S and R2A primers were used and a 550 bp fragment was
produced when primers H1S and R2A were used. The sequences of these three primers were
derived from conserved sequences for plant CCLs.
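
The primer listings above are written in three-letter symbols. One plausible reading, offered here only as an assumption, is that each symbol stands in for the one-letter nucleotide code sharing its amino acid letter (for example Gly for G, Tyr for the degenerate base Y, and Ile for I, which would denote inosine). The mapping table and the function names below are illustrative and are not part of the specification.

```python
# Assumed mapping: three-letter symbol -> one-letter nucleotide code.
THREE_TO_ONE = {
    "Ala": "A", "Cys": "C", "Gly": "G", "Thr": "T",  # unambiguous bases
    "Met": "M", "Tyr": "Y", "Arg": "R",              # two-fold degenerate
    "Asp": "D", "His": "H",                          # three-fold degenerate
    "Asn": "N",                                      # any base
    "Ile": "I",                                      # read here as inosine
}

# Number of bases each code can represent (inosine counted as one).
DEGENERACY = {"M": 2, "Y": 2, "R": 2, "D": 3, "H": 3, "N": 4}


def decode_listing(listing):
    """Convert a space-separated three-letter listing to one-letter codes."""
    return "".join(THREE_TO_ONE[token] for token in listing.split())


def degeneracy(primer):
    """Count the distinct sequences in a degenerate primer pool."""
    count = 1
    for base in primer:
        count *= DEGENERACY.get(base, 1)
    return count


primer_22 = decode_listing(
    "Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr Ala Ala Cys Ala Ala Gly Gly Cys"
)
print(primer_22, degeneracy(primer_22))
```

Under this reading, the first eight symbols shared by primers R1S and H1S decode to TTGGATCC, which contains the BamHI recognition sequence GGATCC. That is consistent with a restriction-site tail on the primers, although the specification does not say so.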
[0070] The reverse transcription-double PCR cloning technique used for these examples
consisted of adding 10 µg of DNA-free total RNA in 25 µl DEPC-treated water to a microfuge
tube. Next, the following solutions were added: 

a. 5x Reverse transcript buffer 8.0 jal, 

b. 0.1 MDTT4.0 \xl 

c. 10 mMdNTP 2.0^1 

d. 1 00 |oM oligo-dT primers 8.0 ^il 

e. Rnasin 2.0 jlxI 

f. Superscript II 1 .0 |il 

[0071] After mixing, the tube was incubated at a temperature of 42° C. for one (1) hour, 
followed by incubation at 70° C. for fifteen (15) minutes. Forty (40) µl of 1N NaOH was 
added and the tube was further incubated at 68° C. for twenty (20) minutes. After the 
incubation periods, 80 µl of 1N HCl was added to the reaction mixture. At the same time, 17 
µl NaOAc, 5 µl glycogen and 768 µl of 100% ethanol were added and the reaction mixture 
was maintained at -80° C. for 15 minutes in order to precipitate the cDNA. The precipitated 
cDNA was centrifuged at high speed at 4° C. for 15 minutes. The resulting pellet was washed 
with 70% ethanol, dried at room temperature, and then dissolved in 20 µl of 
water. 
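
By way of illustration only, the volumes in the reverse transcription and precipitation steps can be tallied to see how the ethanol volume relates to the aqueous volume. The sketch below assumes every volume is in microlitres and that nothing is lost between steps; the dictionary names are hypothetical.

```python
# Volumes in microlitres, as listed in the reverse transcription step.
rt_reaction = {
    "total RNA in DEPC-treated water": 25.0,
    "5x reverse transcript buffer": 8.0,
    "0.1 M DTT": 4.0,
    "10 mM dNTP": 2.0,
    "oligo-dT primers": 8.0,
    "RNasin": 2.0,
    "Superscript II": 1.0,
}

# Aqueous additions made after the incubations, before the ethanol.
later_additions = {
    "1N NaOH": 40.0,
    "1N HCl": 80.0,
    "NaOAc": 17.0,
    "glycogen": 5.0,
}

ethanol_ul = 768.0

rt_total_ul = sum(rt_reaction.values())
aqueous_total_ul = rt_total_ul + sum(later_additions.values())
ethanol_volumes = ethanol_ul / aqueous_total_ul

print(f"RT reaction: {rt_total_ul} ul")
print(f"Aqueous phase before ethanol: {aqueous_total_ul} ul")
print(f"Ethanol added: {ethanol_volumes} volumes")
```

On these figures the reverse transcription reaction totals 50 µl, and the 768 µl of ethanol corresponds to four volumes of the 192 µl aqueous phase.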

[0072] The foregoing procedure produced purified cDNA which was used as a template to 
carry out first round PCR using primers #22 and oligo-dT for cloning OMT cDNA and 
primers R1S and R2A for cloning 4CL cDNA. For the first round PCR, a master mix of 50 µl 
for each reaction was prepared. Each 50 µl mixture contained: 

a. 10x buffer 5 µl 

b. 25 mM MgCl2 5 µl 

c. 100 µM sense primer 1 µl (primer #22 for OMT and primer R1S for CCL). 

d. 100 µM anti-sense primer 1 µl (oligo-dT primer for OMT and R2A for CCL). 

e. 10 mM dNTP 1 µl 

f. Taq DNA polymerase 0.5 µl 

[0073] Of this master mix, 48 µl was added into a PCR tube containing 2 µl of cDNA for 
PCR. The tube was heated to 95° C. for 45 seconds, 52° C. for one minute and 72° C. for two 
minutes. This temperature cycle was repeated for 40 cycles and the mixture was then held at 
72° C. for 10 minutes. 
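
By way of illustration only, the temperature program above can be written as data and its programmed hold time added up. The sketch ignores ramp times between temperatures, which depend on the instrument; the names `Step` and `hold_time_seconds` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float  # block temperature in degrees Celsius
    seconds: int   # hold time at that temperature


# One cycle: denature, anneal, extend.
CYCLE = (Step(95, 45), Step(52, 60), Step(72, 120))
CYCLES = 40
FINAL_EXTENSION = Step(72, 600)


def hold_time_seconds(cycle, n_cycles, final_step):
    """Sum the programmed holds; ramping between steps is not included."""
    return n_cycles * sum(step.seconds for step in cycle) + final_step.seconds


total_seconds = hold_time_seconds(CYCLE, CYCLES, FINAL_EXTENSION)
print(f"{total_seconds} s, or {total_seconds / 60:.0f} min of programmed holds")
```

Forty cycles of 225 seconds plus the final 10 minute hold come to 160 minutes before ramping is counted.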

[0074] The cDNA fragments obtained from the first round of PCR were used as templates 
to perform the second round of PCR using primers 22 and 23 for cloning bi-OMT cDNA and 
primers H1S and R2A for cloning 4CL cDNA. The conditions for the second round of PCR were the 
same as for the first round. 

[0075] The desired cDNA fragment was then subcloned and sequenced. After the second 
round of PCR, the product with the predicted size was excised from the gel and ligated into a 
pUC19 vector, available from Clontech, of Palo Alto, Calif., and then transformed into 
DH5α, an E. coli strain, available from Gibco BRL, of Gaithersburg, Md. After the 
inserts had been checked for correct size, the colonies were isolated and plasmids were 
sequenced using a Sequenase kit available from USB, of Cleveland, Ohio. The sequences are 
shown in FIG. 2 (SEQ ID 5 and 6) and FIG. 3 (SEQ ID 7 and 8). 
Example 2 - Alternative Isolation Method of Angiosperm bi-OMT Gene 
[0076] As previously mentioned, one bi-OMT clone was produced via modified differential 
display technique. This method is another type of reverse transcription-PCR, in which DNA- 
free total RNA was reverse transcribed using oligo-dT primers with a single base pair anchor 
to form cDNA. The oligo-dT primers used for reverse transcription of mRNA to synthesize 
cDNA were: 

T11C: TTTTTTTTTTTTTTC, (SEQ ID 18) and 

T11G: TTTTTTTTTTTTTTG, (SEQ ID 19) 
[0077] These cDNAs were then used as templates for radioactive PCR which was 
conducted in the presence of the same oligo-dT primers as listed above, a bi-OMT gene- 
specific primer and 35S-dATP. The OMT gene-specific primer was derived from the 
following amino acid sequence: 

5'-Cys Cys Asn Gly Gly Asn Gly Gly Ser Ala Arg Gly Ala-3' (SEQ ID 20) 
[0078] The following PCR reaction solutions were combined in a microfuge tube: 

a. H2O 9.2 µl 

b. Taq Buffer 2.0 µl 

c. dNTP (25 µM) 1.6 µl 

d. Primers (5 µM) 2 µl, for each primer 

e. 35S-dATP 1 µl 

f. Taq pol. 0.2 µl 

g. cDNA 2.0 µl. 
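
By way of illustration only, the final concentration of each stock in the combined reaction follows from C1 x V1 = C2 x V2. The sketch below assumes that the stock concentrations given in parentheses are micromolar, that all volumes are microlitres, and that each reaction carries two primers (one oligo-dT primer and the gene-specific primer); the names used are hypothetical.

```python
# Component volumes in microlitres; two primers at 2 ul each are assumed.
components_ul = {
    "H2O": 9.2,
    "Taq buffer": 2.0,
    "dNTP": 1.6,
    "oligo-dT primer": 2.0,
    "gene-specific primer": 2.0,
    "35S-dATP": 1.0,
    "Taq polymerase": 0.2,
    "cDNA": 2.0,
}

total_ul = sum(components_ul.values())


def final_concentration(stock_conc, stock_volume_ul, total_volume_ul):
    """C2 = C1 * V1 / V2, in the same units as the stock concentration."""
    return stock_conc * stock_volume_ul / total_volume_ul


dntp_final_um = final_concentration(25.0, components_ul["dNTP"], total_ul)
primer_final_um = final_concentration(5.0, components_ul["oligo-dT primer"], total_ul)

print(f"total volume: {total_ul:.1f} ul")
print(f"dNTP: {dntp_final_um:.2f} uM, each primer: {primer_final_um:.2f} uM")
```

On these assumptions the components add up to a 20 µl reaction, with dNTP at about 2 µM and each primer at about 0.5 µM.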

[0079] The tube was heated to a temperature of 94° C. and held for 45 seconds, then at 37° 
C. for 2 minutes and then 72° C. for 45 seconds for forty cycles, followed by a final reaction 
at 72° C. for 5 minutes. 

[0080] The amplified products were fractionated on a denaturing polyacrylamide 
sequencing gel and autoradiography was used to identify and excise the fragments with a 
predicted size. The designed OMT gene-specific primer had a sequence conserved in a region 
toward the 3'-end of the OMT cDNA sequence. This primer, together with oligo-dT, 
amplified an OMT cDNA fragment of about 300 bp. 

[0081] Three oligo-dTs with a single base pair of A, C or G, respectively, were used to pair 
with the OMT gene-specific primer. Eight potential OMT cDNA fragments with predicted 
sizes of about 300 bp were excised from the gels after several independent PCR rounds using 
different combinations of oligo-dT and OMT gene-specific oligonucleotides as primers. 
[0082] The OMT cDNA fragments were then re-amplified. A Southern blot analysis was 
performed for the resulting cDNAs using a 360 base-pair, 32P radio-isotope labeled, aspen 
OMT cDNA 3'-end fragment as a probe to identify the cDNA fragments having a strong 
hybridization signal, under low stringency conditions. Eight fragments were identified. Out of 
these eight cDNA fragments, three were selected based on their high hybridization signal for 
sub-cloning and sequencing. One clone, LsOMT3'-1 (where the "Ls" prefix indicates that the 
clone was derived from the Liquidambar styraciflua (L.) genome), was confirmed to encode 
bi-OMT based on its high homology to other lignin-specific plant OMTs at both nucleotide 
and amino acid sequence levels. 

[0083] A cDNA library was constructed in Lambda ZAP II, available from Stratagene, of 
La Jolla, Calif., using 5 mg poly(A)+ RNA isolated from sweetgum xylem tissue. The primary 
library consisting of approximately 0.7 x 10^6 independent recombinants was amplified and 
approximately 10^5 plaque-forming-units (pfu) were screened using a homologous 550 base- 
pair probe. The hybridized filter was washed under high stringency conditions (0.25x SSC, 
0.1% SDS, 65° C.). The colony containing the bi-OMT fragment identified by the probe was 
eluted and the bi-OMT fragment was produced. The sequence as illustrated in FIG. 2 (SEQ 
ID 5 and 6) was obtained. 

Example 3 - Isolating and Producing the DNA which Codes for the Angiosperm P450-1 Gene 
[0084] In order to find putative P450 cDNA fragments as probes for cDNA library 
screening, a highly degenerated sense primer based on the amino acid sequence of 5'-Glu, 
Glu, Phe, Arg, Pro, Glu, Arg-3' was designed based on the conserved regions found in some 
plant P450 proteins. This conserved domain was located upstream of another highly 
conserved region in P450 proteins, which had an amino acid sequence of 5'-Phe Gly Xaa Gly 
Xaa Xaa Cys Xaa Gly-3' (SEQ ID 21). This primer was synthesized with the incorporation of 
an XhoI restriction site to give a 26-base-pair oligomer with a nucleotide sequence of 5' ATG 
TGC AGT TTT TTT TTT TTT TTT TT-3' (SEQ ID 22). 

[0085] This primer and the oligo-dT-XhoI primer were then used to perform PCR reactions 
with the sweetgum cDNA library as a template. The cDNA library was constructed in 
Lambda ZAP II, available from Stratagene, of La Jolla, Calif., using poly(A)+ RNA isolated 
from sweetgum xylem tissue. Amplified fragments of 300 to 600 bp were obtained. Because 
the designed primer was located upstream of the highly conserved P450 domain, this design 
distinguished whether the PCR products were P450 gene fragments depending on whether 
they contained the highly conserved amino acid domain. 
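
By way of illustration only, the check described above, namely whether a translated PCR product contains the conserved domain, can be expressed as a pattern search. The sketch writes the motif Phe Gly Xaa Gly Xaa Xaa Cys Xaa Gly in one-letter code, with Xaa matching any residue. The two peptide strings are hypothetical and are not taken from the clones described here.

```python
import re

# Phe Gly Xaa Gly Xaa Xaa Cys Xaa Gly; "." stands for Xaa (any residue).
CONSERVED_MOTIF = re.compile(r"FG.G..C.G")


def has_conserved_domain(protein: str) -> bool:
    """Return True when the translated fragment contains the motif."""
    return CONSERVED_MOTIF.search(protein) is not None


# Hypothetical translated fragments, for illustration only.
candidates = {
    "clone_a": "MKLPFGAGRRCPGSV",
    "clone_b": "MKLPAAAGRRSPGSV",
}

results = {name: has_conserved_domain(seq) for name, seq in candidates.items()}
print(results)
```

A fragment amplified from a gene lacking the domain would return False and could be set aside before cloning.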

[0086] All the fragments obtained from the PCR reaction were then cloned into a pUC19 
vector, available from Stratagene, of La Jolla, Calif., and transformed into a DH5α E. 
coli strain, available from Gibco BRL, of Gaithersburg, Md. 

[0087] Twenty-four positive colonies were obtained and sequenced. Sequence analysis 
indicated four groupings within the twenty-four colonies. One was C4H, one was an 
unknown P450 gene, and two did not belong to P450 genes. Homologies of P450 genes in 
different species are usually more than 80%. Because the homologies between the P450 gene 
families found here were around 40%, the sequence analysis indicated that a new P450 gene 
family was sequenced. Moreover, since this P450 cDNA was isolated from xylem tissue, it 
was highly probable that this P450 gene was P450-1. 

[0088] The novel sweetgum P450 cDNA fragment was used as a probe to screen for a full 
length cDNA encoding P450-1. Once the P450-1 gene was located it was sequenced. The 
length of the P450-1 cDNA is 1707 bp and it contains 45 bp of 5' non-coding region and 135 
bp of 3' non-coding region. The deduced amino acid sequence also indicates that this P450 
cDNA has a hydrophobic core at the N-terminal, which could be regarded as a leader 
sequence for co-translational targeting to membranes during protein synthesis. At the C- 
terminal region, there is a heme binding domain that is characteristic of all P450 genes. The 
P450-1 sequence, as illustrated in FIG. 4 (SEQ ID 1 and 2), was produced, according to the 
above described methods. 
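
By way of illustration only, the number of codons in a coding region follows directly from its start and end coordinates. The sketch below uses the CDS ranges given in the <222> entries of the sequence listing for SEQ ID 1 and SEQ ID 3; the function name `codon_count` is hypothetical.

```python
def codon_count(cds_start: int, cds_end: int) -> int:
    """Number of codons in a CDS given 1-based, inclusive coordinates."""
    length = cds_end - cds_start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


# Coordinates as given in the <222> entries of the sequence listing.
p450_1_codons = codon_count(48, 1571)   # SEQ ID 1
p450_2_codons = codon_count(74, 1606)   # SEQ ID 3

print(p450_1_codons, p450_2_codons)
```

The first figure agrees with the 508-residue length given for SEQ ID 2.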

Example 4 - Isolating and Producing the DNA which Codes for the Angiosperm P450-2 Gene 
[0089] By using a similar strategy of synthesizing PCR primers from the published literature 
for hydroxylase genes in plants, another full length P450 cDNA has been isolated that shows 
significant similarity with a putative F5H clone from Arabidopsis (Meyer et al. 1996: PNAS 
93, 6869-6874). This cloned cDNA, designated P450-2, contains 1883 bp and encodes an 
open reading frame of 511 amino acids. The amino acid similarity shared between 
Arabidopsis F5H and the P450-2 sweetgum clone is about 75%. 

[0090] To confirm the function of the P450-2 gene, it was expressed in E. coli strain DH5α 
via pQE vector preparation, according to directions available with the kit. A CO-Fe2+ 
binding assay was also performed to confirm the expression of P450-2 as a functional 
P450 gene (Omura & Sato 1964, J. of Biochemistry 239: 2370-2378; Babriac et al. 1991, 
Archives of Biochemistry and Biophysics 288:302-309). The CO-Fe2+ binding assay 
showed a peak at 450 nm which indicates that P450-2 has been overexpressed as a functional 
P450 gene. 

[0091] The P450-2 protein was further purified for production of antibodies in rabbits, and 
antibodies have been successfully produced. In addition, Western blots show that this 
antibody is specific to the membrane fraction of sweetgum and aspen xylem extract. When 
the P450-2 antibody was added to a reaction mixture containing aspen xylem tissue, enzyme 
inhibition studies showed that the activity of P450 in aspen was reduced more than 60%, a 
further indication that P450-2 performs a P450-like function. Recombinant P450-2 protein co- 
expressed with Arabidopsis CPR protein in a baculovirus expression system hydroxylated 
ferulic acid (specific activity: 7.3 pKat/mg protein), cinnamic acid (specific activity: 25 
pKat/mg protein), and p-coumaric acid (specific activity: 3.8 pKat/mg protein). The P450-2 
enzyme, which may be referred to as C4C3F5-H, appears to be a broad spectrum hydroxylase 
in the phenylpropanoid pathway in plants. FIG. 5 (SEQ ID 3 and 4) illustrates the P450-2 
sequence. 

Example 5 - Identifying Gymnosperm Promoter Regions 

[0092] In order to identify gymnosperm promoter regions, sequences from loblolly pine 
PAL, 4CL1B and 4CL3B lignin genes were used as primers to screen the loblolly pine 
genomic library, using the GenomeWalker kit. The loblolly pine PAL primer sequence was 
obtained from GenBank, reference number U39792. The loblolly pine 4CL1B primer 
sequences were also obtained from GenBank, reference numbers U39404 and U39405. 
[0093] The loblolly pine genomic library was constructed in Lambda DASH II, available 
from Stratagene, of La Jolla, Calif. 3 x 10^6 phage plaques from the genomic library of loblolly 
pine were screened using both the above mentioned PAL cDNA and 4CL (PCR clone) 
fragments as probes. Five 4CL clones were obtained after screening. Lambda DNAs of two 
of the five 4CL clones obtained after screening were isolated and digested by EcoRV, 
PstI, SalI and XbaI for Southern analysis. Southern analysis using 4CL fragments as probes 
indicated that both clones for the 4CL gene were identical. Results from further mapping 
showed that none of the original five 4CL clones contained promoter regions. When tested, 
the PAL clones obtained from the screening also did not contain promoter regions. 
[0094] In a second attempt to clone the promoter regions associated with the PAL and 4CL 
genes, a Universal GenomeWalker™ kit, available from Clontech, was used. In the process, 
total DNA from loblolly pine was digested by several restriction enzymes and ligated to the 
adaptors provided with the kit to form libraries. Two gene-specific primers for each gene were 
designed (GSP1 and 2). After two rounds of PCR using these primers and adapter primers of 
the kit, several fragments were amplified from each library. A 1.6 kb fragment and a 0.6 kb 
fragment for the PAL gene and a 2.3 kb fragment (4CL1B) and a 0.7 kb fragment (4CL3B) for 
the 4CL gene were cloned, sequenced and found to contain promoter regions for all three 
genes. See FIG. 6 (SEQ ID 10), 7 (SEQ ID 11) and 8 (SEQ ID 9). 

Example 6 - Fusing the ASL DNA Sequence to a Constitutive Promoter Region and 
Inserting the Expression Cassette Into a Gymnosperm Genome 

[0095] As a first step, an ASL DNA sequence, P450-1, was fused with a constitutive 
promoter region according to the methods described in the above Section IV to form a 
P450-1 expression cassette. A second ASL DNA sequence, P450-2, was then fused with a 
constitutive promoter in the same manner to form a P450-2 expression cassette. The P450-1 
expression cassette was inserted into the gymnosperm genome by micro-projectile 
bombardment. Embryogenic tissue cultures of loblolly pine were initiated from immature 
zygotic embryos. The tissue was maintained in an undifferentiated state on semi-solid 
proliferation medium, according to methods described by Newton et al. TAES Technical 
Publication "Somatic Embryogenesis in Slash Pine", 1995 and Keinonen-Mettala et al. 1996, 
Scand. J. For. Res. 11: 242-250. 

[0096] After separation, 5 ml of the liquid cell suspension fraction which passed through the 
40 mesh screen was vacuum deposited onto filter paper and placed on semi-solid 
proliferation medium. The prepared gymnosperm target cells were then grown for 2 days on 
filter paper discs placed on semi-solid proliferation medium in a petri dish. These target cells 
were then bombarded with plasmid DNA containing the P450-1 expression cassette and an 
expression cassette containing a selectable marker gene encoding the enzyme which confers 
resistance to the antibiotic hygromycin B. A 1:1 mixture of the selectable marker expression 
cassette and plasmid DNA containing the P450-1 expression cassette was precipitated with gold 
(1.5-3.0 microns) as described by Sanford et al. (1992). The DNA-coated microprojectiles 
were rinsed in absolute ethanol and aliquots of 10 µl (5 µg DNA/3 mg gold) were dried onto 
a macrocarrier, such as those available from BioRad (Hercules, Calif.). 
[0097] Prior to bombardment, embryogenic tissue was desiccated under a sterile laminar- 
flow hood for 5 minutes. The desiccated tissue was transferred to semi-solid proliferation 
medium. The microprojectiles were accelerated into desiccated target cells using a BioRad 
PDS-1000/HE particle gun. 

[0098] Each plate was bombarded once, rotated 180 degrees, and bombarded a second time. 
Preferred bombardment parameters were 1350 psi rupture disc pressure, 6 mm distance from 
the rupture disc to macrocarrier (gap distance), 1 cm macrocarrier travel distance, and 10 cm 
distance from macrocarrier stopping screen to culture plate (microcarrier travel distance). 
Tissue was then transferred to semi-solid proliferation medium containing hygromycin B for 
two days after bombardment. 

[0099] The P450-2 expression cassette was inserted into the gymnosperm genome 
according to the same procedures. 

Example 7 - Selecting Transformed Target Cells 

[0100] After insertion of the P450-1 expression cassette and the selectable marker 
expression cassette into the gymnosperm target cells as described in Example 6, transformed 
cells were selected by exposure to an antibiotic that causes mortality of any cells not 
containing the GSL expression cassette. Forty independent cell lines were established from 
cultures co-bombarded with an expression cassette containing a hygromycin resistance gene 
construct and the P450-1 construct. These cell lines include lines Y2, Y17, Y7 and 04, as 
discussed in more detail below. 

[0101] PCR techniques were then used to verify that the P450-1 gene had been successfully 
integrated into the genomes of the established cell lines. Genomic DNA was extracted using 
the Plant DNeasy kit, available from Qiagen, and 200 ng DNA from each cell line were used 
for each PCR reaction. Two P450-1 specific primers were designed to perform a PCR 
reaction with a 600 bp PCR product size. The primers were: 

LsP450-iml-S primer: ATGGCTTTCCTTCTAATACCCATCTC (SEQ ID 23), and 

LsP450-iml-A primer: GGGTGTAATGGACGAGCAAGGACTTG (SEQ ID 24). 

[0102] Each PCR reaction (100 µl) consisted of 75 µl H2O, 1 µl MgCl2 (25 mM), 10 µl PCR 
buffer, 1 µl 10 mM dNTPs, and 10 µl DNA. 100 µl oil was layered on the top of each reaction 
mix. Hot start PCR was done as follows: the PCR reaction was incubated at 95° C. for 7 
minutes and 1 µl each of both LsP450-iml-S and LsP450-iml-A primers (100 µM stock) and 
1 µl of Taq polymerase were added through the oil in each reaction. The PCR program used was 
95° C. for 1.5 minutes, 55° C. for 45 sec and 72° C. for 2 minutes, 
repeated for 40 cycles, followed by extension at 72° C. for 10 minutes. 
[0103] The above PCR products were employed to determine if gymnosperm cells 
contained the angiosperm lignin gene sequences. With reference to FIG. 9, PCR 
amplification was performed using template DNA from cells which grew vigorously on 
hygromycin B-containing medium. The PCR products were electrophoresed in an agarose gel 
containing 9 lanes. Lanes 1-4 contained PCR amplification products of the sweetgum 
P450-1 gene from a non-transformed control and transgenic loblolly pine cell lines. Lane 1 
contained the non-transformed control PT52. Lane 2 contained transgenic line Y2. Lane 3 
contained transgenic line Y17 and Lane 4 contained the plasmid which contains the 
expression cassette pSSLsP4501-im-s. Lanes 2 through 4 all contain an amplified fragment of 
about 600 bp, indicating that the P450-1 gene has been successfully inserted into transgenic 
cell lines Y2 and Y17. 
[0104] Lane 5 contained a DNA size marker PhiX174/HaeIII (BRL). The top four bands in 
this lane indicate molecular sizes of 1353, 1078, 872 and 603 bp. 

[0105] Lanes 6-9 contained PCR amplification products of hygromycin B gene from non- 
transformed control and transgenic loblolly pine cell lines. Lane 6 contained the non- 
transformed control, referred to as PTS. Lane 7 contained transgenic line Y7. Lane 8 
contained transgenic line 04. Lane 9 contained the plasmid which includes the expression 
cassette containing the gene encoding the enzyme which confers resistance to the antibiotic 
hygromycin B. Lanes 7-9 all show an amplified fragment of about 1000 bp, indicating that 
the hygromycin gene has been successfully inserted into transgenic lines Y7 and 04. 
[0106] These PCR results confirmed the presence of P450-1 and hygromycin resistance 
gene in transformed loblolly pine cell cultures. The results obtained from the PCR 
verification of 4 cell lines, and similar tests with the remaining 36 cell lines, confirm stable 
integration of the P450-1 gene and the hygromycin B gene in 25% of the 40 cell lines. 
[0107] In addition, loblolly pine embryogenic cells which have been co-bombarded with 
the P450-2 and hygromycin B expression cassettes, are growing vigorously on hygromycin 
selection medium, indicating that the P450-2 expression cassette was successfully integrated 
into the gymnosperm genome. 

[0108] Although various embodiments and features of the invention have been described in 
the foregoing detailed description, those of ordinary skill will recognize the invention is 
capable of numerous modifications, rearrangements and substitutions without departing from 
the scope of the invention as set forth in the appended claims. For example, in the case where 
the lignin DNA sequence is transcribed and translated to produce a functional syringyl lignin 
gene, those of ordinary skill will recognize that because of codon degeneracy a number of 
polynucleotide sequences will encode the same gene. These variants are intended to be 
covered by the DNA sequences disclosed and claimed herein. In addition, the sequences 
claimed herein include those sequences which encode a gene having substantial functional 
identity with those claimed. Thus, in the case of syringyl lignin genes, for example, the DNA 
sequences include variant polynucleotide sequences encoding polypeptides which have 
substantial identity with the amino acid sequence of syringyl lignin and which show syringyl 
lignin activity in gymnosperms. 
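
By way of illustration only, the effect of codon degeneracy referred to above can be quantified by counting how many distinct coding sequences specify the same peptide. The sketch below builds the standard genetic code and multiplies the number of synonymous codons at each position; the example peptide is the first three residues of SEQ ID 2 (Met Ala Phe) in one-letter code, and the function name `synonymous_sequences` is hypothetical.

```python
from collections import Counter
from math import prod

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# How many codons specify each residue (for example, six for leucine).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def synonymous_sequences(peptide: str) -> int:
    """Count the coding sequences that translate to the given peptide."""
    return prod(CODONS_PER_RESIDUE[residue] for residue in peptide)


n_maf = synonymous_sequences("MAF")
print(n_maf)
```

Even this three-residue stretch can be encoded in several ways, and the count grows multiplicatively with peptide length.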
SEQUENCE LISTING 



<110> CHIANG, VINCENT L. 

CARRAWAY, DANIEL T. 
SMELTZER, RICHARD H. 

<120> PRODUCTION OF SYRINGYL LIGNIN IN GYMNOSPERMS 

<130> 044463-0336 

<140> 10/681,878 
<141> 2003-10-09 

<150> 09/796,256 
<151> 2001-02-28 

<150> 08/991,677 
<151> 1997-12-16 

<150> 60/033,381 
<151> 1996-12-16 

<160> 24 

<170> PatentIn Ver. 3.3 

<210> 1 

<211> 1708 

<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (48)..(1571) 

<400> 1 

cggcacgagg aaaccctaaa actcacctct cttacccttt ctcttca atg gct ttc 56 
Met Ala Phe 
1 

ctt cta ata ccc atc tca ata atc ttc atc gtc tta gct tac cag ctc 104 
Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala Tyr Gln Leu 
5 10 15 

tat caa cgg ctc aga ttt aag ctc cca ccc ggc cca cgt cca tgg ccg 152 
Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg Pro Trp Pro 
20 25 30 35 

atc gtc gga aac ctt tac gac ata aaa ccg gtg agg ttc cgg tgt ttc 200 
Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe Arg Cys Phe 
40 45 50 

gcc gag tgg tca caa gcg tac ggt ccg atc ata tcg gtg tgg ttc ggt 248 
Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val Trp Phe Gly 
55 60 65 

tca acg ttg aat gtg atc gta tcg aat tcg gaa ttg gct aag gaa gtg 296 
Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala Lys Glu Val 
70 75 80 



ctc aag gaa aaa gat caa caa ttg gct gat agg cat agg agt aga tca 344 
Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg Ser Arg Ser 
85 90 95 

gct gcc aaa ttt agc agg gat ggg cag gac ctt ata tgg gct gat tat 392 
Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp Ala Asp Tyr 
100 105 110 115 

gga cct cac tat gtg aag gtt aca aag gtt tgt acc ctc gag ctt ttt 440 
Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu Glu Leu Phe 
120 125 130 

act cca aag cgg ctt gaa gct ctt aga ccc att aga gaa gat gaa gtt 488 
Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu Asp Glu Val 
135 140 145 

aca gcc atg gtt gag tcc att ttt aat gac act gcg aat cct gaa aat 536 
Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn Pro Glu Asn 
150 155 160 

tat ggg aag agt atg ctg gtg aag aag tat ttg gga gca gta gca ttc 584 
Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala Val Ala Phe 
165 170 175 

aac aac att aca aga ctc gca ttt gga aag cga ttc gtg aat tca gag 632 
Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val Asn Ser Glu 
180 185 190 195 

ggt gta atg gac gag caa gga ctt gaa ttt aag gaa att gtg gcc aat 680 
Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile Val Ala Asn 
200 205 210 

gga ctc aag ctt ggt gcc tca ctt gca atg gct gag cac att cct tgg 728 
Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His Ile Pro Trp 
215 220 225 

ctc cgt tgg atg ttc cca ctt gag gaa ggg gcc ttt gcc aag cat ggg 776 
Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala Lys His Gly 
230 235 240 

gca cgt agg gac cga ctt acc aga gct atc atg gaa gag cac aca ata 824 
Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu His Thr Ile 
245 250 255 

gcc cgt aaa aag agt ggt gga gcc caa caa cat ttc gtg gat gca ttg 872 
Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val Asp Ala Leu 
260 265 270 275 

ctc acc cta caa gag aaa tat gac ctt agc gag gac act att att ggg 920 
Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr Ile Ile Gly 
280 285 290 

ctc ctt tgg gat atg atc act gca ggc atg gac aca acc gca atc tct 968 
Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr Ala Ile Ser 
295 300 305 

gtc gaa tgg gcc atg gcc gag tta att aag aac cca agg gtg caa caa 1016 
Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg Val Gln Gln 
310 315 320 

aaa gct caa gag gag cta gac aat gta ctt ggg tcc gaa cgt gtc ctg 1064 
Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu Arg Val Leu 
325 330 335 

acc gaa ttg gac ttc tca agc ctc cct tat cta caa tgt gta gcc aag 1112 
Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys Val Ala Lys 
340 345 350 355 

gag gca cta agg ctg cac cct cca aca cca cta atg ctc cct cat cgc 1160 
Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu Pro His Arg 
360 365 370 

gcc aat gcc aac gtc aaa att ggt ggc tac gac atc cct aag gga tca 1208 
Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro Lys Gly Ser 
375 380 385 

aat gtt cat gta aat gtc tgg gcc gtg gct cgt gat cca gca gtg tgg 1256 
Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro Ala Val Trp 
390 395 400 

cgt gac cca cta gag ttt cga ccg gaa cgg ttc tct gaa gac gat gtc 1304 
Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu Asp Asp Val 
405 410 415 

gac atg aaa ggt cac gat tat agg cta ctg ccg ttt ggt gca ggg agg 1352 
Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly Ala Gly Arg 
420 425 430 435 

cgt gtt tgc ccc ggt gca caa ctt ggc atc aat ttg gtc aca tcc atg 1400 
Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val Thr Ser Met 
440 445 450 

atg ggt cac cta ttg cac cat ttc tat tgg agc cct cct aaa ggt gta 1448 
Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro Lys Gly Val 
455 460 465 

aaa cca gag gag att gac atg tca gag aat cca gga ttg gtc acc tac 1496 
Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu Val Thr Tyr 
470 475 480 

atg cga acc ccg gtg caa gct gtt ccc act cca agg ctg cct gct cac 1544 
Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu Pro Ala His 
485 490 495 

ttg tac aaa cgt gta gct gtg gat atg taattcttag tttgttatta 1591 
Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 

ttcatgctct taaggttttg gactttgaac ttatgatgag atttgtaaaa ttccaagtga 1651 

tcaaatgaag aaaagaccaa ataaaaaggc ttgacgattt aaaaaaaaaa aaaaaaa 1708 



<210> 2 

<211> 508 

<212> PRT 

<213> Liquidambar styraciflua 



<400> 2 

Met Ala Phe Leu Leu Ile Pro Ile Ser Ile Ile Phe Ile Val Leu Ala 
1 5 10 15 

Tyr Gln Leu Tyr Gln Arg Leu Arg Phe Lys Leu Pro Pro Gly Pro Arg 
20 25 30 

Pro Trp Pro Ile Val Gly Asn Leu Tyr Asp Ile Lys Pro Val Arg Phe 
35 40 45 

Arg Cys Phe Ala Glu Trp Ser Gln Ala Tyr Gly Pro Ile Ile Ser Val 
50 55 60 

Trp Phe Gly Ser Thr Leu Asn Val Ile Val Ser Asn Ser Glu Leu Ala 
65 70 75 80 

Lys Glu Val Leu Lys Glu Lys Asp Gln Gln Leu Ala Asp Arg His Arg 
85 90 95 

Ser Arg Ser Ala Ala Lys Phe Ser Arg Asp Gly Gln Asp Leu Ile Trp 
100 105 110 

Ala Asp Tyr Gly Pro His Tyr Val Lys Val Thr Lys Val Cys Thr Leu 
115 120 125 

Glu Leu Phe Thr Pro Lys Arg Leu Glu Ala Leu Arg Pro Ile Arg Glu 
130 135 140 

Asp Glu Val Thr Ala Met Val Glu Ser Ile Phe Asn Asp Thr Ala Asn 
145 150 155 160 

Pro Glu Asn Tyr Gly Lys Ser Met Leu Val Lys Lys Tyr Leu Gly Ala 
165 170 175 

Val Ala Phe Asn Asn Ile Thr Arg Leu Ala Phe Gly Lys Arg Phe Val 
180 185 190 

Asn Ser Glu Gly Val Met Asp Glu Gln Gly Leu Glu Phe Lys Glu Ile 
195 200 205 

Val Ala Asn Gly Leu Lys Leu Gly Ala Ser Leu Ala Met Ala Glu His 
210 215 220 

Ile Pro Trp Leu Arg Trp Met Phe Pro Leu Glu Glu Gly Ala Phe Ala 
225 230 235 240 

Lys His Gly Ala Arg Arg Asp Arg Leu Thr Arg Ala Ile Met Glu Glu 
245 250 255 




His Thr Ile Ala Arg Lys Lys Ser Gly Gly Ala Gln Gln His Phe Val 
260 265 270 

Asp Ala Leu Leu Thr Leu Gln Glu Lys Tyr Asp Leu Ser Glu Asp Thr 
275 280 285 

Ile Ile Gly Leu Leu Trp Asp Met Ile Thr Ala Gly Met Asp Thr Thr 
290 295 300 

Ala Ile Ser Val Glu Trp Ala Met Ala Glu Leu Ile Lys Asn Pro Arg 
305 310 315 320 

Val Gln Gln Lys Ala Gln Glu Glu Leu Asp Asn Val Leu Gly Ser Glu 
325 330 335 

Arg Val Leu Thr Glu Leu Asp Phe Ser Ser Leu Pro Tyr Leu Gln Cys 
340 345 350 

Val Ala Lys Glu Ala Leu Arg Leu His Pro Pro Thr Pro Leu Met Leu 
355 360 365 

Pro His Arg Ala Asn Ala Asn Val Lys Ile Gly Gly Tyr Asp Ile Pro 
370 375 380 

Lys Gly Ser Asn Val His Val Asn Val Trp Ala Val Ala Arg Asp Pro 
385 390 395 400 

Ala Val Trp Arg Asp Pro Leu Glu Phe Arg Pro Glu Arg Phe Ser Glu 
405 410 415 

Asp Asp Val Asp Met Lys Gly His Asp Tyr Arg Leu Leu Pro Phe Gly 
420 425 430 

Ala Gly Arg Arg Val Cys Pro Gly Ala Gln Leu Gly Ile Asn Leu Val 
435 440 445 

Thr Ser Met Met Gly His Leu Leu His His Phe Tyr Trp Ser Pro Pro 
450 455 460 

Lys Gly Val Lys Pro Glu Glu Ile Asp Met Ser Glu Asn Pro Gly Leu 
465 470 475 480 

Val Thr Tyr Met Arg Thr Pro Val Gln Ala Val Pro Thr Pro Arg Leu 
485 490 495 

Pro Ala His Leu Tyr Lys Arg Val Ala Val Asp Met 
500 505 



<210> 3 
<211> 1883 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 
<221> CDS 

<222> (74)..(1606) 






<400> 3 

tgcaaacctg cacaaacaaa gagagagaag aagaaaaagg aagagaggag agagagagag 60 

agagagagaa gcc atg gat tct tct ctt cat gaa gcc ttg caa cca cta 109 
Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu 
1 5 10 

ccc atg acg ctg ttc ttc att ata cct ttg cta ctc tta ttg ggc cta 157 
Pro Met Thr Leu Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu 
15 20 25 

gta tct cgg ctt cgc cag aga cta cca tac cca cca ggc cca aaa ggc 205 
Val Ser Arg Leu Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly 
30 35 40 

tta ccg gtg atc gga aac atg ctc atg atg gat caa ctc act cac cga 253 
Leu Pro Val Ile Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg 
45 50 55 60 

gga ctc gcc aaa ctc gcc aaa caa tac ggc ggt cta ttc cac ctc aag 301 
Gly Leu Ala Lys Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys 
65 70 75 

atg gga ttc tta cac atg gtg gcc gtt tcc aca ccc gac atg gct cgc 349 
Met Gly Phe Leu His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg 
80 85 90 

caa gtc ctt caa gtc caa gac aac atc ttc tcg aac cgg cca gcc acc 397 
Gln Val Leu Gln Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr 
95 100 105 

ata gcc atc agc tac ctc acc tat gac cga gcc gac atg gcc ttc gct 445 
Ile Ala Ile Ser Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala 
110 115 120 

cac tac ggc ccg ttt tgg cgt cag atg cgt aaa ctc tgc gtc atg aaa 493 
His Tyr Gly Pro Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys 
125 130 135 140 

tta ttt agc cgg aaa cga gcc gag tcg tgg gag tcg gtc cga gac gag 541 
Leu Phe Ser Arg Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu 
145 150 155 

gtc gac tcg gca gta cga gtg gtc gcg tcc aat att ggg tcg acg gtg 589 
Val Asp Ser Ala Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val 
160 165 170 

aat atc ggc gag ctg gtt ttt gct ctg acg aag aat att act tac agg 637 
Asn Ile Gly Glu Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg 
175 180 185 

gcg gct ttt ggg acg atc tcg cat gag gac cag gac gag ttc gtg gcc 685 
Ala Ala Phe Gly Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala 
190 195 200 



ata ctg caa gag ttt tcg cag ctg ttt ggt gct ttt aat ata gct gat 
Ile Leu Gln Glu Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp 
205 210 215 220 

ttt atc cct tgg ctc aaa tgg gtt cct cag ggg att aac gtc agg ctc 
Phe Ile Pro Trp Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu 
225 230 235 

aac aag gca cga ggg gcg ctt gat ggg ttt att gac aag atc atc gac 
Asn Lys Ala Arg Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp 
240 245 250 

gat cat ata cag aag ggg agt aaa aac tcg gag gag gtt gat act gat 
Asp His Ile Gln Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp 
255 260 265 

atg gta gat gat tta ctt gct ttt tac ggt gag gaa gcc aaa gta agc 
Met Val Asp Asp Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser 
270 275 280 

gaa tct gac gat ctt caa aat tcc atc aaa ctc acc aaa gac aac atc 
Glu Ser Asp Asp Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile 
285 290 295 300 

aaa gct atc atg gac gta atg ttt gga ggg acc gaa acg gtg gcg tcc 
Lys Ala Ile Met Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser 
305 310 315 

gcg att gaa tgg gcc atg acg gag ctg atg aaa agc cca gaa gat cta 
Ala Ile Glu Trp Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu 
320 325 330 

aag aag gtc caa caa gaa ctc gcc gtg gtg gtg ggt ctt gac cgg cga 
Lys Lys Val Gln Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg 
335 340 345 

gtc gaa gag aaa gac ttc gag aag ctc acc tac ttg aaa tgc gta ctg 
Val Glu Glu Lys Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu 
350 355 360 

aag gaa gtc ctt cgc ctc cac cca ccc atc cca ctc ctc ctc cac gag 
Lys Glu Val Leu Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu 
365 370 375 380 

act gcc gag gac gcc gag gtc ggc ggc tac tac att ccg gcg aaa tcg 
Thr Ala Glu Asp Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser 
385 390 395 

cgg gtg atg atc aac gcg tgc gcc atc ggc cgg gac aag aac tcg tgg 
Arg Val Met Ile Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp 
400 405 410 

gcc gac cca gat acg ttt agg ccc tcc agg ttt ctc aaa gac ggt gtg 
Ala Asp Pro Asp Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val 
415 420 425 






ccc gat ttc aaa ggg aac aac ttc gag ttc atc cca ttc ggg tca ggt 1405 
Pro Asp Phe Lys Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly 
430 435 440 

cgt cgg tct tgc ccc ggt atg caa ctc gga ctc tac gcg cta gag acg 1453 
Arg Arg Ser Cys Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr 
445 450 455 460 

act gtg gct cac ctc ctt cac tgt ttc acg tgg gag ttg ccg gac ggg 1501 
Thr Val Ala His Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly 
465 470 475 

atg aaa ccg agt gaa ctc gag atg aat gat gtg ttt gga ctc acc gcg 1549 
Met Lys Pro Ser Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala 
480 485 490 

cca aga gcg att cga ctc acc gcc gtg ccg agt cca cgc ctt ctc tgt 1597 
Pro Arg Ala Ile Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys 
495 500 505 

cct ctc tat tgatcgaatg attgggggag ctttgtggag gggcttttat 1646 
Pro Leu Tyr 
510 

ggagactcta tatatagatg ggaagtgaaa caacgacagg tgaatgcttg gatttttggt 1706 
atatattggg gagggagggg aaaaaaaaaa taatgaaagg aaagaaaaga gagaatttga 1766 
atttctcttc ctctgtggat aaaagcctcg tttttaattg tttttatgtg gagatatttg 1826 
tgtttgttta tttttatctc tttttttgca ataacactca aaaataaaaa aaaaaaa 1883 



<210> 4 
<211> 511 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 4 

Met Asp Ser Ser Leu His Glu Ala Leu Gln Pro Leu Pro Met Thr Leu 
1 5 10 15 

Phe Phe Ile Ile Pro Leu Leu Leu Leu Leu Gly Leu Val Ser Arg Leu 
20 25 30 

Arg Gln Arg Leu Pro Tyr Pro Pro Gly Pro Lys Gly Leu Pro Val Ile 
35 40 45 

Gly Asn Met Leu Met Met Asp Gln Leu Thr His Arg Gly Leu Ala Lys 
50 55 60 

Leu Ala Lys Gln Tyr Gly Gly Leu Phe His Leu Lys Met Gly Phe Leu 
65 70 75 80 

His Met Val Ala Val Ser Thr Pro Asp Met Ala Arg Gln Val Leu Gln 
85 90 95 

Val Gln Asp Asn Ile Phe Ser Asn Arg Pro Ala Thr Ile Ala Ile Ser 
100 105 110 

Tyr Leu Thr Tyr Asp Arg Ala Asp Met Ala Phe Ala His Tyr Gly Pro 
115 120 125 

Phe Trp Arg Gln Met Arg Lys Leu Cys Val Met Lys Leu Phe Ser Arg 
130 135 140 

Lys Arg Ala Glu Ser Trp Glu Ser Val Arg Asp Glu Val Asp Ser Ala 
145 150 155 160 

Val Arg Val Val Ala Ser Asn Ile Gly Ser Thr Val Asn Ile Gly Glu 
165 170 175 

Leu Val Phe Ala Leu Thr Lys Asn Ile Thr Tyr Arg Ala Ala Phe Gly 
180 185 190 

Thr Ile Ser His Glu Asp Gln Asp Glu Phe Val Ala Ile Leu Gln Glu 
195 200 205 

Phe Ser Gln Leu Phe Gly Ala Phe Asn Ile Ala Asp Phe Ile Pro Trp 
210 215 220 

Leu Lys Trp Val Pro Gln Gly Ile Asn Val Arg Leu Asn Lys Ala Arg 
225 230 235 240 

Gly Ala Leu Asp Gly Phe Ile Asp Lys Ile Ile Asp Asp His Ile Gln 
245 250 255 

Lys Gly Ser Lys Asn Ser Glu Glu Val Asp Thr Asp Met Val Asp Asp 
260 265 270 

Leu Leu Ala Phe Tyr Gly Glu Glu Ala Lys Val Ser Glu Ser Asp Asp 
275 280 285 

Leu Gln Asn Ser Ile Lys Leu Thr Lys Asp Asn Ile Lys Ala Ile Met 
290 295 300 

Asp Val Met Phe Gly Gly Thr Glu Thr Val Ala Ser Ala Ile Glu Trp 
305 310 315 320 

Ala Met Thr Glu Leu Met Lys Ser Pro Glu Asp Leu Lys Lys Val Gln 
325 330 335 

Gln Glu Leu Ala Val Val Val Gly Leu Asp Arg Arg Val Glu Glu Lys 
340 345 350 

Asp Phe Glu Lys Leu Thr Tyr Leu Lys Cys Val Leu Lys Glu Val Leu 
355 360 365 

Arg Leu His Pro Pro Ile Pro Leu Leu Leu His Glu Thr Ala Glu Asp 
370 375 380 

Ala Glu Val Gly Gly Tyr Tyr Ile Pro Ala Lys Ser Arg Val Met Ile 
385 390 395 400 

Asn Ala Cys Ala Ile Gly Arg Asp Lys Asn Ser Trp Ala Asp Pro Asp 
405 410 415 

Thr Phe Arg Pro Ser Arg Phe Leu Lys Asp Gly Val Pro Asp Phe Lys 
420 425 430 

Gly Asn Asn Phe Glu Phe Ile Pro Phe Gly Ser Gly Arg Arg Ser Cys 
435 440 445 

Pro Gly Met Gln Leu Gly Leu Tyr Ala Leu Glu Thr Thr Val Ala His 
450 455 460 

Leu Leu His Cys Phe Thr Trp Glu Leu Pro Asp Gly Met Lys Pro Ser 
465 470 475 480 

Glu Leu Glu Met Asn Asp Val Phe Gly Leu Thr Ala Pro Arg Ala Ile 
485 490 495 

Arg Leu Thr Ala Val Pro Ser Pro Arg Leu Leu Cys Pro Leu Tyr 
500 505 510 



<210> 5 
<211> 1380 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (67)..(1170) 

<400> 5 

cggcacgagc cctacctcct ttcttggaaa aatttcccca ttcgatcaca atccgggcct 60 



caaaaa atg gga tca aca agc gaa acg aag atg agc ccg agt gaa gca 108 
Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala 
1 5 10 

gca gca gca gaa gaa gaa gca ttc gta ttc gct atg caa tta acc agt 156 
Ala Ala Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser 
15 20 25 30 

gct tca gtt ctt ccc atg gtc cta aaa tca gcc ata gag ctc gac gtc 204 
Ala Ser Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val 
35 40 45 

tta gaa atc atg gct aaa gct ggt cca ggt gcg cac ata tcc aca tct 252 
Leu Glu Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser 
50 55 60 

gac ata gcc tct aag ctg ccc aca aag aat cca gat gca gcc gtc atg 300 
Asp Ile Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met 
65 70 75 

ctt gac cgt atg ctc cgc ctc ttg gct agc tac tct gtt cta acg tgc 348 
Leu Asp Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys 
80 85 90 

tct ctc cgc acc ctc cct gac ggc aag atc gag agg ctt tac ggc ctt 396 
Ser Leu Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu 
95 100 105 110 

gca ccc gtt tgt aaa ttc ttg acc aga aac gat gat gga gtc tcc ata 444 
Ala Pro Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile 
115 120 125 

gcc gct ctg tct ctc atg aat caa gac aag gtc ctc atg gag agc tgg 492 
Ala Ala Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp 
130 135 140 

tac cac ttg acc gag gca gtt ctt gaa ggt gga att cca ttt aac aag 540 
Tyr His Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys 
145 150 155 

gcc tat gga atg aca gca ttt gag tac cat ggc acc gat ccc aga ttc 588 
Ala Tyr Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe 
160 165 170 

aac aca gtt ttc aac aat gga atg tcc aat cat tcg acc att acc atg 636 
Asn Thr Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met 
175 180 185 190 

aag aaa atc ctt gag act tac aaa ggg ttc gag gga ctt gga tct gtg 684 
Lys Lys Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val 
195 200 205 

gtt gat gtt ggt ggt ggc act ggt gcc cac ctt aac atg att atc gct 732 
Val Asp Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala 
210 215 220 

aaa tac ccc atg atc aag ggc att aac ttc gac ttg cct cat gtt att 780 
Lys Tyr Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile 
225 230 235 

gag gag gct ccc tcc tat cct ggt gtg gag cat gtt ggt gga gat atg 828 
Glu Glu Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met 
240 245 250 

ttt gtt agt gtt cca aaa gga gat gcc att ttc atg aag tgg ata tgt 876 
Phe Val Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys 
255 260 265 270 

cat gat tgg agc gat gaa cac tgc ttg aag ttt ttg aag aaa tgt tat 924 
His Asp Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr 
275 280 285 

gaa gca ctt cca acc aat ggg aag gtg atc ctt gct gaa tgc atc ctc 972 
Glu Ala Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu 
290 295 300 

ccc gtg gcg cca gac gca agc ctc ccc act aag gca gtg gtc cat att 1020 
Pro Val Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile 
305 310 315 

gat gtc atc atg ttg gct cat aac cca ggt ggg aaa gag aga act gag 1068 
Asp Val Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu 
320 325 330 

aag gag ttt gag gcc ttg gcc aag ggg gct gga ttt gaa ggt ttc cga 1116 
Lys Glu Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg 
335 340 345 350 

gta gta gcc tcg tgc gct tac aat aca tgg atc atc gaa ttt ttg aag 1164 
Val Val Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys 
355 360 365 

aag att tgagtcctta ctcggctttg agtacataat accaactcct tttggttttc 1220 
Lys Ile 

gagattgtga ttgtgattgt gattgtctct ctttcgcagt tggccttatg atataatgta 1280 
tcgttaactc gatcacagaa gtgcaaaaga cagtgaatgt acactgcttt ataaaataaa 1340 
aattttaaga ttttgattca tgtaaaaaaa aaaaaaaaaa 1380 

<210> 6 
<211> 368 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 6 

Met Gly Ser Thr Ser Glu Thr Lys Met Ser Pro Ser Glu Ala Ala Ala 
1 5 10 15 

Ala Glu Glu Glu Ala Phe Val Phe Ala Met Gln Leu Thr Ser Ala Ser 
20 25 30 

Val Leu Pro Met Val Leu Lys Ser Ala Ile Glu Leu Asp Val Leu Glu 
35 40 45 

Ile Met Ala Lys Ala Gly Pro Gly Ala His Ile Ser Thr Ser Asp Ile 
50 55 60 

Ala Ser Lys Leu Pro Thr Lys Asn Pro Asp Ala Ala Val Met Leu Asp 
65 70 75 80 

Arg Met Leu Arg Leu Leu Ala Ser Tyr Ser Val Leu Thr Cys Ser Leu 
85 90 95 

Arg Thr Leu Pro Asp Gly Lys Ile Glu Arg Leu Tyr Gly Leu Ala Pro 
100 105 110 

Val Cys Lys Phe Leu Thr Arg Asn Asp Asp Gly Val Ser Ile Ala Ala 
115 120 125 

Leu Ser Leu Met Asn Gln Asp Lys Val Leu Met Glu Ser Trp Tyr His 
130 135 140 

Leu Thr Glu Ala Val Leu Glu Gly Gly Ile Pro Phe Asn Lys Ala Tyr 
145 150 155 160 






Gly Met Thr Ala Phe Glu Tyr His Gly Thr Asp Pro Arg Phe Asn Thr 
165 170 175 

Val Phe Asn Asn Gly Met Ser Asn His Ser Thr Ile Thr Met Lys Lys 
180 185 190 

Ile Leu Glu Thr Tyr Lys Gly Phe Glu Gly Leu Gly Ser Val Val Asp 
195 200 205 

Val Gly Gly Gly Thr Gly Ala His Leu Asn Met Ile Ile Ala Lys Tyr 
210 215 220 

Pro Met Ile Lys Gly Ile Asn Phe Asp Leu Pro His Val Ile Glu Glu 
225 230 235 240 

Ala Pro Ser Tyr Pro Gly Val Glu His Val Gly Gly Asp Met Phe Val 
245 250 255 

Ser Val Pro Lys Gly Asp Ala Ile Phe Met Lys Trp Ile Cys His Asp 
260 265 270 

Trp Ser Asp Glu His Cys Leu Lys Phe Leu Lys Lys Cys Tyr Glu Ala 
275 280 285 

Leu Pro Thr Asn Gly Lys Val Ile Leu Ala Glu Cys Ile Leu Pro Val 
290 295 300 

Ala Pro Asp Ala Ser Leu Pro Thr Lys Ala Val Val His Ile Asp Val 
305 310 315 320 

Ile Met Leu Ala His Asn Pro Gly Gly Lys Glu Arg Thr Glu Lys Glu 
325 330 335 

Phe Glu Ala Leu Ala Lys Gly Ala Gly Phe Glu Gly Phe Arg Val Val 
340 345 350 

Ala Ser Cys Ala Tyr Asn Thr Trp Ile Ile Glu Phe Leu Lys Lys Ile 
355 360 365 



<210> 7 
<211> 2025 
<212> DNA 

<213> Liquidambar styraciflua 

<220> 

<221> CDS 

<222> (60)..(1679) 

<400> 7 

cggcacgagc tcattttcca cttctggttt gatctctgca attcttccat cagtcccta 

atg gag acc caa aca aaa caa gaa gaa atc ata tat cgg tcg aaa ctc 
Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

ccc gat atc tac atc ccc aaa cac ctc cct tta cat tcg tat tgt ttc 155 
Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

gag aac atc tca cag ttc ggc tcc cgc ccc tgt ctg atc aat ggc gca 203 
Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

acg ggc aag tat tac aca tat gct gag gtt gag ctc att gcg cgc aag 251 
Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

gtc gca tcc ggc ctc aac aaa ctc ggc gtt cga caa ggt gac atc atc 299 
Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

atg ctt ttg cta ccc aac tcg ccg gag ttc gtg ttt tca att ctc ggc 347 
Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

gca tcc tac cgc ggg gct gcc gcc acc gcc gca aac ccg ttt tat acc 395 
Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

cct gcc gag atc agg aag caa gcc aaa acc tcc aac gcc agg ctt att 443 
Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

atc aca cat gcc tgt tac tat gag aaa gtg aag gac ttg gtg gaa gag 491 
Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

aac gtt gcc aag atc ata tgt ata gac tca ccc ccg gac ggt tgt ttg 539 
Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 

cac ttc tcg gag ctg agt gag gcg gac gag aac gac atg ccc aat gta 587 
His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

gag att gac ccc gat gat gtg gtg gcg ctg ccg tac tcg tca ggg acg 635 
Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

acg ggt tta cca aag ggg gtg atg cta aca cac aag gga caa gtg acg 683 
Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

agt gtg gcg caa cag gtg gac gga gag aat ccg aac ctg tat ata cat 731 
Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

agc gag gac gtg gtt ctg tgc gtg ttg cct ctg ttt cac atc tac tcg 779 
Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 






atg aac gtc atg ttt tgc ggg tta cga gtt ggt gcg gcg att ctg att 827 
Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

atg cag aaa ttt gaa ata tat ggg ttg tta gag ctg gtc aga agt aca 875 
Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

ggt gac cat cat gcc tat cgt aca ccc atc gta ttg gca atc tcc aag 923 
Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

act ccg gat ctt cac aac tat gat gtg tcc tcc att cgg act gtc atg 971 
Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

tca ggt gcg gct cct ctg ggc aag gaa ctt gaa gat tct gtc aga gct 1019 
Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

aag ttt ccc acc gcc aaa ctt ggt cag gga tat gga atg acg gag gca 1067 
Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

ggg ccc gtg cta gcg atg tgt ttg gca ttt gcc aag gaa ggg ttt gaa 1115 
Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 

ata aaa tcg ggg gca tct gga act gtt tta agg aac gca cag atg aag 1163 
Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys 
355 360 365 

att gtg gac cct gaa acc ggt gtc act ctc cct cga aac caa ccc gga 1211 
Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly 
370 375 380 

gag att tgc att aga gga gac caa atc atg aaa ggt tat ctt aat gat 1259 
Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp 
385 390 395 400 

cct gag gcg acg gag aga acc ata gac aag gaa ggt tgg tta cac aca 1307 
Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr 
405 410 415 

ggt gat gtg ggc tac atc gac gat gac act gag ctc ttc att gtt gat 1355 
Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp 
420 425 430 



cgg ttg aag gaa ctg atc aaa tac aaa ggg ttt cag gtg gca ccc gct 1403 
Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala 
435 440 445 

gag ctt gag gcc atg ctc ctc aac cat ccc aac atc tct gat gct gcc 1451 
Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala 
450 455 460 

gtc gtc cca atg aaa gac gat gaa gct gga gag ctc cct gtg gcg ttt 1499 
Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe 
465 470 475 480 



gtt gta aga tca gat ggt tct cag ata tcc gag gct gaa atc agg caa 1547 
Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln 
485 490 495 

tac atc gca aaa cag gtg gtt ttt tat aaa aga ata cat cgc gta ttt 1595 
Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe 
500 505 510 

ttc gtc gaa gcc att cct aaa gcg ccc tct ggc aaa atc ttg cgg aag 1643 
Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys 
515 520 525 

gac ctg aga gcc aaa ttg gcg tct ggt ctt ccc aat taattctcat 1689 
Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 

tcgctaccct cctttctctt atcatacgcc aacacgaacg aagaggctca attaaacgct 1749 

gctcattcga agcggctcaa ttaaagctgc tcattcatgt ccaccgagtg ggcagcctgt 1809 

cttgttggga tgttctttca tttgattcag ctgtgagaag ccagaccctc attatttatt 1869 

gtgaaattca caagaatgtc tgtaaatcga tgttgtgagt gatgggtttc aaaacacttt 1929 

tgacattgtt tacgttgtat ttcctgctgt tgaaaataac tactttgtat gacttttatt 1989 

tgggaagata acctttcaaa aaaaaaaaaa aaaaaa 2025 



<210> 8 
<211> 540 
<212> PRT 

<213> Liquidambar styraciflua 
<400> 8 

Met Glu Thr Gln Thr Lys Gln Glu Glu Ile Ile Tyr Arg Ser Lys Leu 
1 5 10 15 

Pro Asp Ile Tyr Ile Pro Lys His Leu Pro Leu His Ser Tyr Cys Phe 
20 25 30 

Glu Asn Ile Ser Gln Phe Gly Ser Arg Pro Cys Leu Ile Asn Gly Ala 
35 40 45 

Thr Gly Lys Tyr Tyr Thr Tyr Ala Glu Val Glu Leu Ile Ala Arg Lys 
50 55 60 

Val Ala Ser Gly Leu Asn Lys Leu Gly Val Arg Gln Gly Asp Ile Ile 
65 70 75 80 

Met Leu Leu Leu Pro Asn Ser Pro Glu Phe Val Phe Ser Ile Leu Gly 
85 90 95 

Ala Ser Tyr Arg Gly Ala Ala Ala Thr Ala Ala Asn Pro Phe Tyr Thr 
100 105 110 

Pro Ala Glu Ile Arg Lys Gln Ala Lys Thr Ser Asn Ala Arg Leu Ile 
115 120 125 

Ile Thr His Ala Cys Tyr Tyr Glu Lys Val Lys Asp Leu Val Glu Glu 
130 135 140 

Asn Val Ala Lys Ile Ile Cys Ile Asp Ser Pro Pro Asp Gly Cys Leu 
145 150 155 160 

His Phe Ser Glu Leu Ser Glu Ala Asp Glu Asn Asp Met Pro Asn Val 
165 170 175 

Glu Ile Asp Pro Asp Asp Val Val Ala Leu Pro Tyr Ser Ser Gly Thr 
180 185 190 

Thr Gly Leu Pro Lys Gly Val Met Leu Thr His Lys Gly Gln Val Thr 
195 200 205 

Ser Val Ala Gln Gln Val Asp Gly Glu Asn Pro Asn Leu Tyr Ile His 
210 215 220 

Ser Glu Asp Val Val Leu Cys Val Leu Pro Leu Phe His Ile Tyr Ser 
225 230 235 240 

Met Asn Val Met Phe Cys Gly Leu Arg Val Gly Ala Ala Ile Leu Ile 
245 250 255 

Met Gln Lys Phe Glu Ile Tyr Gly Leu Leu Glu Leu Val Arg Ser Thr 
260 265 270 

Gly Asp His His Ala Tyr Arg Thr Pro Ile Val Leu Ala Ile Ser Lys 
275 280 285 

Thr Pro Asp Leu His Asn Tyr Asp Val Ser Ser Ile Arg Thr Val Met 
290 295 300 

Ser Gly Ala Ala Pro Leu Gly Lys Glu Leu Glu Asp Ser Val Arg Ala 
305 310 315 320 

Lys Phe Pro Thr Ala Lys Leu Gly Gln Gly Tyr Gly Met Thr Glu Ala 
325 330 335 

Gly Pro Val Leu Ala Met Cys Leu Ala Phe Ala Lys Glu Gly Phe Glu 
340 345 350 

Ile Lys Ser Gly Ala Ser Gly Thr Val Leu Arg Asn Ala Gln Met Lys 
355 360 365 

Ile Val Asp Pro Glu Thr Gly Val Thr Leu Pro Arg Asn Gln Pro Gly 
370 375 380 

Glu Ile Cys Ile Arg Gly Asp Gln Ile Met Lys Gly Tyr Leu Asn Asp 
385 390 395 400 

Pro Glu Ala Thr Glu Arg Thr Ile Asp Lys Glu Gly Trp Leu His Thr 
405 410 415 



Gly Asp Val Gly Tyr Ile Asp Asp Asp Thr Glu Leu Phe Ile Val Asp 
420 425 430 

Arg Leu Lys Glu Leu Ile Lys Tyr Lys Gly Phe Gln Val Ala Pro Ala 
435 440 445 

Glu Leu Glu Ala Met Leu Leu Asn His Pro Asn Ile Ser Asp Ala Ala 
450 455 460 

Val Val Pro Met Lys Asp Asp Glu Ala Gly Glu Leu Pro Val Ala Phe 
465 470 475 480 

Val Val Arg Ser Asp Gly Ser Gln Ile Ser Glu Ala Glu Ile Arg Gln 
485 490 495 

Tyr Ile Ala Lys Gln Val Val Phe Tyr Lys Arg Ile His Arg Val Phe 
500 505 510 

Phe Val Glu Ala Ile Pro Lys Ala Pro Ser Gly Lys Ile Leu Arg Lys 
515 520 525 

Asp Leu Arg Ala Lys Leu Ala Ser Gly Leu Pro Asn 
530 535 540 


<210> 9 

<211> 1544 

<212> DNA 

<213> Pinus taeda 

<400> 9 

aaagataata tatgtgtatg cctactacta cacattgttt tgaagtgtgt aaacatagtg 60 
caacactagg aggactcaca atgagcactt gttgacatga aactagctaa atgcccaaca 120 
atattagtga aagctagtta aactaacccc tttgactttc aagatgatat atttatatcc 180 
ctactacgtc ttcctctttt tgtctttctc ttgtgattaa accttccttg aaacaattct 240 
caaatgtaaa attaaacctt gaaacttgta gagaccaaac ttccctagga gaaaccacat 300 
ttatgacaac atatatacac caacccattg catactataa tattggaatt acctgcagcg 360 
aacgaaagaa acgctgtctc accaactcgt gcactacatc ccgaaactta accttcccct 420 
gatacagatt gaagagccga aaaaagcgtg catccaaatt tctggtatgg tgaggagccg 480 
aaaaacgcgt gcgcctaatt tttttgagat gggccggaaa ataatgcgtg catctaaatt 540 
ttcacgtgtc gcgtattggc gaggttgcgc tgaatgtgat cctgtgcgtg agccacattc 600 
attccattgg ttgacccgcc ggtaccgcga ggaccgtggg gtctcacaga tacgcggatg 660 
gtggatcagc actgagaaga ttagatgatg accaggcggg catttgaagt aaaaacttgg 720 
gggtggttgg caagtacgcg acaaagaggg gtagtgcgca aggaagcgag ttggatgcaa 780 
ataatattac aaagtgggtt ggtgggcatg agcatcaacc agaatgatgt tgttgctggt 840 
tccgtgcaaa ttctgaccag tagtttgaac aatactaccc aacttgtttt tggtaaaaca 900 
tgaagtgggt aaggagaatt gaacttacgt ctcatggtaa agggcaaggg caaatgactt 960 
aacacatacc tttaactaat aaaaataccc ctaacaaata cgaaaacgaa tgagttatca 1020 
cagaccttca actaataaga tagccatcag acccacatct cctgactgac caaaaacaaa 1080 
tgacttcaac caactaagat acccatcaaa gctaacccac aacccaattc ctcacttccc 1140 
cttaccagac caaccaagca gacctacgcc attaactact ttaggacgtg ggaattgggg 1200 
gtgccaccgt tgaagaatgg cactcagggt tggtaatccc tccacgtgta tgtagcagtc 1260 
gtttggtgga gacggcgtgt ttgaatgtcc accttccagt ttggagaaca aggaaattgg 1320 
gcttatatta ggcctggatc tcttgtttca gagcaggagt agttcaggac aggaactagc 1380 
attcaagaat tcaattgccc tgccctgctc tgctctgctt tgctcaactt attgatccct 1440 
gctctggttt gttcaatttc ttgacccctg ctgggttctg ctctggtttg cacactttct 1500 
cgattatata agtcattttg gatccttgca aggaagagaa tatg 1544 



<210> 10 

<211> 659 

<212> DNA 

<213> Pinus taeda 

<400> 10 

aaacaccaat ttaatgggat ttcagatttg tatcccatgc tattggctaa ggcatttttc 60 
ttattgtaat ctaaccaatt ctaatttcca ccctggtgtg aactgactga caaatgcggt 120 
ccgaaaacag cgaatgaaat gtctgggtga tcggtcaaac aagcggtggg cgagagagcg 180 
cgggtgttgg cctagccggg atgggggtag gtagacggcg tattaccggc gagttgtccg 240 
aatggagttt tcggggtagg tagtaacgta gacgtcaatg gaaaaagtca taatctccgt 300 
caaaaatcca accgctcctt cacatcgcag agttggtggc cacgggaccc tccacccact 360 
cactcaatcg atcgcctgcc gtggttgccc attattcaac catacgccac ttgactcttc 420 
accaacaatt ccaggccggc tttctataca atgtactgca caggaaaatc caatataaaa 480 
agccggcctc tgcttccttc tcagtagccc ccagctcatt caattcttcc cactgcaggc 540 
tacatttgtc agacacgttt tccgccattt ttcgcctgtt tctgcggaga atttgatcag 600 
gttcggattg ggattgaatc aattgaaagg tttttatttt cagtatttcg atcgccatg 659 



<210> 11 

<211> 2251 

<212> DNA 

<213> Pinus taeda 

<400> 11 

ggccgggtgg tgacatttat tcataaattc atctcaaaac aagaaggatt tacaaaaata 60 
aaagaaaaca aaattttcat ctttaacata attataattg tgttcacaaa attcaaactt 120 
aaacccttaa tataaagaat ttctttcaac aatacacttt aatcacaact tcttcaatca 180 
caacctcctc caacaaaatt aaaatagatt aataaataaa taaacttaac tatttaaaaa 240 
aaaatattat acaaaattta ttaaaacttc aaaataaaca aactttttat acaaaattca 300 
tcaaaacttt aaaataaagc taaacactga aaatgtgagt acatttaaaa ggacgctgat 360 
cacaaaaatt ttgaaaacat aaacaaactt gaaactctac cttttaagaa tgagtttgtc 420 
gtctcattaa ctcattagtt ttatagttcg aatccaatta acgtatcttt tattttatgg 480 
aataagggtg ttttaataag tgattttggg atttttttag taatttattt gtgatatgtt 540 
atggagtttt taaaaatata tatatatata tatatttttg ggttgagttt acttaaaatt 600 
tggaaaaggt tggtaagaac tataaattga gttgtgaatg agtgttttat ggatttttta 660 
agatgttaaa tttatatatg taattaaaat tttattttga ataacaaaaa ttataattgg 720 
ataaaaaatt gttttgttaa atttagagta aaaatttcaa aatctaaaat aattaaacac 780 
tattattttt aaaaaatttg ttggtaaatt ttatcttata tttaagttaa aatttagaaa 840 
aaattaattt taaattaata aacttttgaa gtcaaatatt ccaaatattt tccaaaatat 900 
taaatctatt ttgcattcaa aatacaattt aaataataaa acttcatgga atagattaac 960 
caatttgtat aaaaaccaaa aatctcaaat aaaatttaaa ttacaaaaca ttatcaacat 1020 
tatgatttca agaaagacaa taaccagttt ccaataaaat aaaaaacctc atggcccgta 1080 
attaagatct cattaattaa ttcttatttt ttaatttttt tacatagaaa atatctttat 1140 
attgtatcca agaaatatag aatgttctcg tccagggact attaatctcc aaacaagttt 1200 
caaaatcatt acattaaagc tcatcatgtc atttgtggat tggaaattat attgtataag 1260 
agaaatatag aatgttctcg tctagggact attaatttcc aaacaaattt caaaatcatt 1320 
acattaaagc tcatcatgtc atttgtggat tggaaattag acaaaaaaaa tcccaaatat 1380 
ttctctcaat ctcccaaaat atagttcgaa ctccatattt ttggaaattg agaatttttt 1440 
tacccaataa tatatttttt tatacatttt agagattttc cagacatatt tgctctggga 1500 
tttattggaa tgaaggttga gttataaact ttcagtaatc caagtatctt cggtttttga 1560 
agatactaaa tccattatat aataaaaaca cattttaaac accaatttaa tgggatttca 1620 
gatttgtatc ccatgctatt ggctaaggca tttttcttat tgtaatctaa ccaattctaa 1680 
tttccaccct ggtgtgaact gactgacaaa tgcggtccga aaacagcgaa tgaaatgtct 1740 
gggtgatcgg tcaaacaagc ggtgggcgag agagcgcggg tgttggccta gccgggatgg 1800 
gggtaggtag acggcgtatt accggcgagt tgtccgaatg gagttttcgg ggtaggtagt 1860 
aacgtagacg tcaatggaaa aagtcataat ctccgtcaaa aatccaaccg ctccttcaca 1920 
tcgcagagtt ggtggccacg ggaccctcca cccactcact cgatcgcctg ccgtggttgc 1980 
ccattattca accatacgcc acttgactct tcaccaacaa ttccaggccg gctttctata 2040 
caatgtactg cacaggaaaa tccaatataa aaagccggcc tctgcttcct tctcagtagc 2100 
ccccagctca ttcaattctt cccactgcag gctacatttg tcagacacgt tttccgccat 2160 
ttttcgcctg tttctgcgga gaatttgatc aggttcggat tgggattgaa tcaattgaaa 2220 
ggtttttatt ttcagtattt cgatcgccat g 2251 



<210> 12 
<211> 20 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 12 

Gly Gly Met Ala Thr Tyr Cys Cys Ala Thr Thr Tyr Ala Ala Cys Ala 
1 5 10 15 



Ala Gly Gly Cys 
20 



<210> 13 
<211> 20 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 13 

Ala Ala Ala Gly Ala Gly Ala Gly Asn Ala Cys Asn Asn Ala Asn Asn 
1 5 10 15 

Ala Asn Gly Ala 
20 



<210> 14 
<211> 31 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 14 

Thr Thr Gly Gly Ala Thr Cys Cys Gly Gly Ile Ala Cys Ile Ala Cys 
1 5 10 15 

Ile Gly Gly Ile Tyr Thr Ile Cys Cys Ile Ala Ala Arg Gly Gly 
20 25 30 



<210> 15 
<211> 31 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 15 

Thr Thr Gly Gly Ala Thr Cys Cys Gly Thr Ile Gly Thr Ile Gly Cys 
1 5 10 15 

Ile Cys Ala Arg Cys Ala Arg Gly Thr Ile Gly Ala Tyr Gly Gly 
20 25 30 



<210> 16 
<211> 27 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 16 

Cys Cys Ile Cys Thr Tyr Thr Ala Asp Ala Cys Arg Thr Ala Asp Gly 
1 5 10 15 

Cys Ile Cys Cys Ala Gly Cys Thr Gly Thr Ala 
20 25 



<210> 17 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
primer 

<400> 17 

tttttttttt tttta 



<210> 18 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
primer 

<400> 18 

tttttttttt ttttc 



<210> 19 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
primer 

<400> 19 

tttttttttt ttttg 



<210> 20 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<400> 20 

Cys Cys Asn Gly Gly Asn Gly Gly Ser Ala Arg Gly Ala 
1 5 10 



<210> 21 
<211> 9 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
peptide 

<220> 

<221> MOD_RES 
<222> (3) 

<223> Variable amino acid 
<220> 

<221> MOD_RES 

<222> (5) . . (6) 

<223> Variable amino acid 

<220> 

<221> MOD_RES 
<222> (8) 

<223> Variable amino acid 
<400> 21 

Phe Gly Xaa Gly Xaa Xaa Cys Xaa Gly 
1 5 



<210> 22 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
primer 

<220> 

<221> modified_base 
<222> (23) 
<223> Inosine 

<400> 22 

atgtgcagtt tttttttttt ttnttt 



<210> 23 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
primer 

<400> 23 

atggctttcc ttctaatacc catctc 



<210> 24 
<211> 26 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequence: Synthetic 
primer 

<400> 24 

gggtgtaatg gacgagcaag gacttg 